Chapter 1. Domains, Modules and Matrices



1.1 Rings, Domains and Fields 

Definition 1.1.1 A nonempty set $R$ is called a ring if $R$ has two binary
operations, called addition and multiplication, such that for all $a, b, c \in R$
the following holds:

(1.1.1) $a + b \in R$;

(1.1.2) $a + b = b + a$ (the commutative law);

(1.1.3) $(a + b) + c = a + (b + c)$ (the associative law);

(1.1.4) $\exists\, 0 \in R$ such that $a + 0 = 0 + a = a$, $\forall a \in R$;

(1.1.5) $\forall a \in R$, $\exists\, {-a} \in R$ such that $a + (-a) = 0$;

(1.1.6) $ab \in R$;

(1.1.7) $a(bc) = (ab)c$ (the associative law);

(1.1.8) $a(b + c) = ab + ac$, $(b + c)a = ba + ca$ (the distributive laws).

$R$ has an identity element $1$ if $a1 = 1a = a$ for all $a \in R$. $R$ is called
commutative if

(1.1.9) $ab = ba$, for all $a, b \in R$.

Note that the properties (1.1.2)-(1.1.8) imply that $a0 = 0a = 0$. If $a$ and
$b$ are two nonzero elements such that

(1.1.10) $ab = 0$

then $a$ and $b$ are called zero divisors.

Definition 1.1.2 $\mathbb{D}$ is called an integral domain if $\mathbb{D}$ is a commutative
ring without zero divisors and containing identity $1$.

The classical example of an integral domain is the ring of integers Z. In 
this book we shall use the following example of an integral domain. 

Example 1.1.3 Let $\Omega \subset \mathbb{C}^n$ be a nonempty set. Then $H(\Omega)$ denotes
the ring of analytic functions $f(z_1, \ldots, z_n)$ such that for each $\zeta \in \Omega$ there
exists an open neighborhood $O(f, \zeta)$ of $\zeta$ such that $f$ is analytic on $O(f, \zeta)$.
If $\Omega$ is open we assume that $f$ is defined only on $\Omega$. If $\Omega$ consists of one
point $\zeta$ then $H_\zeta$ stands for $H(\{\zeta\})$.

Note that the zero element is the zero function and the identity element
is the constant function which is equal to 1. The properties of analytic
functions imply that $H(\Omega)$ is an integral domain if and only if $\Omega$ is a
connected set. ($\Omega$ is connected if for any open set $O \supset \Omega$ there exists an open
connected set $O'$ such that $O \supset O' \supset \Omega$.) In this book we shall assume
that $\Omega$ is connected unless otherwise stated. See [Rud74] and [GuR65] for
properties of analytic functions in one and several complex variables.

For $a, b \in \mathbb{D}$, $a$ divides $b$ (or $a$ is a divisor of $b$), denoted by $a \mid b$, if $b = ab_1$
for some $b_1 \in \mathbb{D}$. An element $a$ is called invertible (unit, unimodular) if
$a \mid 1$. $a, b \in \mathbb{D}$ are associates, denoted by $a \equiv b$, if $a \mid b$ and $b \mid a$. Denote
$\{\{b\}\} = \{a \in \mathbb{D} : a \equiv b\}$. The associates of $a$ and units are called improper
divisors of $a$. For an invertible $a$ denote by $a^{-1}$ the unique element such
that

(1.1.11) $a a^{-1} = a^{-1} a = 1$.

$f \in H(\Omega)$ is invertible if and only if $f$ does not vanish at any point of $\Omega$.

Definition 1.1.4 A field $\mathbb{F}$ is an integral domain $\mathbb{D}$ such that any
nonzero element is invertible.

The familiar examples of fields are the set of rational numbers $\mathbb{Q}$, the
set of real numbers $\mathbb{R}$, and the set of complex numbers $\mathbb{C}$. Given an integral
domain $\mathbb{D}$ there is a standard way to construct the field $\mathbb{F}$ of its quotients.
$\mathbb{F}$ is formed by the set of equivalence classes of all quotients $\frac{a}{b}$, $b \ne 0$, such
that

(1.1.12) $\frac{a}{b} + \frac{c}{d} = \frac{ad + bc}{bd}$, $\frac{a}{b} \cdot \frac{c}{d} = \frac{ac}{bd}$, $b, d \ne 0$.
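The rules (1.1.12) can be tried out on the quotient field of $\mathbb{Z}$. The following is a minimal Python sketch, not part of the text: it assumes a quotient $\frac{a}{b}$ is stored as a pair `(a, b)` with `b != 0`, and that two quotients are equivalent when $ad = bc$ (the usual convention, which the text does not spell out). The function names are illustrative only.

```python
def equivalent(x, y):
    """Assumed equivalence: a/b ~ c/d exactly when a*d == b*c."""
    (a, b), (c, d) = x, y
    return a * d == b * c


def add(x, y):
    """Sum of two quotients following (1.1.12): (ad + bc) / (bd)."""
    (a, b), (c, d) = x, y
    return (a * d + b * c, b * d)


def mul(x, y):
    """Product of two quotients following (1.1.12): (ac) / (bd)."""
    (a, b), (c, d) = x, y
    return (a * c, b * d)
```

For instance, `add((1, 2), (1, 3))` produces a representative of the class of $\frac{5}{6}$, and replacing an argument by an equivalent pair gives an equivalent result.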



Definition 1.1.5 For $\Omega \subset \mathbb{C}^n$, $\zeta \in \mathbb{C}^n$ let $\mathcal{M}(\Omega)$, $\mathcal{M}_\zeta$ denote the
quotient fields of $H(\Omega)$, $H_\zeta$ respectively.

Definition 1.1.6 Let $\mathbb{D}[x_1, \ldots, x_n]$ be the ring of all polynomials in $n$
variables with coefficients in $\mathbb{D}$:

(1.1.13) $p(x_1, \ldots, x_n) = \sum_{|\alpha| \le m} a_\alpha x^\alpha$,

$\alpha = (\alpha_1, \ldots, \alpha_n) \in \mathbb{Z}_+^n$, $|\alpha| := \sum_{i=1}^n \alpha_i$, $x^\alpha := x_1^{\alpha_1} \cdots x_n^{\alpha_n}$.

The degree of $p(x_1, \ldots, x_n) \ne 0$ ($\deg p$) is $m$ if there exists $a_\alpha \ne 0$ such
that $|\alpha| = m$. ($\deg 0 = 0$.) A polynomial $p$ is called homogeneous if $a_\alpha = 0$
for all $|\alpha| < \deg p$. It is a standard fact that $\mathbb{D}[x_1, \ldots, x_n]$ is an integral
domain. (See Problems 2-3.) As usual $\mathbb{F}(x_1, \ldots, x_n)$ denotes the quotient
field of $\mathbb{F}[x_1, \ldots, x_n]$.

Problems 

1. Let C[a, b] be the set of real valued continuous functions on the inter- 
val [a, b], a < b. Show that C[a, b] is a commutative ring with identity 
and zero divisors. 

2. Prove that D[x] is an integral domain. 

3. Prove that $\mathbb{D}[x_1, \ldots, x_n]$ is an integral domain. (Use the previous
problem and the identity $\mathbb{D}[x_1, \ldots, x_n] = \mathbb{D}[x_1, \ldots, x_{n-1}][x_n]$.)

4. Let $p(x_1, \ldots, x_n) \in \mathbb{D}[x_1, \ldots, x_n]$. Show that $p = \sum_{i \le \deg p} p_i$, where
each $p_i$ is either a zero polynomial or a homogeneous polynomial of
degree $i$ for $i \ge 1$. If $p$ is not a constant polynomial then $m = \deg p \ge 1$
and $p_m \ne 0$. The polynomial $p_m$ is called the principal part of $p$
and is denoted by $p_\pi$. (If $p$ is a constant polynomial then $p_\pi = p$.)

5. Let $p, q \in \mathbb{D}[x_1, \ldots, x_n]$. Show $(pq)_\pi = p_\pi q_\pi$.

1.2 Bezout Domains 

Let $a_1, \ldots, a_n \in \mathbb{D}$. Assume first that not all of $a_1, \ldots, a_n$ are equal to zero.
An element $d \in \mathbb{D}$ is a greatest common divisor (g.c.d.) of $a_1, \ldots, a_n$ if $d \mid a_i$
for $i = 1, \ldots, n$, and for any $d'$ such that $d' \mid a_i$, $i = 1, \ldots, n$, $d' \mid d$. Denote by
$(a_1, \ldots, a_n)$ any g.c.d. of $a_1, \ldots, a_n$. Then $\{\{(a_1, \ldots, a_n)\}\}$ is the equivalence
class of all g.c.d. of $a_1, \ldots, a_n$. For $a_1 = \ldots = a_n = 0$, we define $0$ to be the
g.c.d. of $a_1, \ldots, a_n$, i.e. $(a_1, \ldots, a_n) = 0$. The elements $a_1, \ldots, a_n$ are called
coprime if $\{\{(a_1, \ldots, a_n)\}\} = \{\{1\}\}$.

Definition 1.2.1 $\mathbb{D}$ is called a greatest common divisor domain, or
simply GCD domain and denoted by $\mathbb{D}_G$, if any two elements in $\mathbb{D}$ have
a g.c.d.

A simple example of $\mathbb{D}_G$ is $\mathbb{Z}$. See Problem 5 for a non GCD domain.

Definition 1.2.2 A subset $I \subset \mathbb{D}$ is called an ideal if for any $a, b \in I$
and $p, q \in \mathbb{D}$ the element $pa + qb$ belongs to $I$.

In $\mathbb{Z}$ any nontrivial ideal is the set of all numbers divisible by an integer
$k \ne 0$. In $H(\Omega)$, the set of functions which vanish on a prescribed set
$U \subset \Omega$, i.e.

(1.2.1) $I(U) := \{f \in H(\Omega) : f(\zeta) = 0, \ \zeta \in U\}$,

is an ideal. An ideal $I$ is called prime if $ab \in I$ implies that either $a$ or $b$ is
in $I$. $I \subset \mathbb{Z}$ is a prime ideal if and only if $I$ is the set of integers divisible
by some prime number $p$. An ideal $I$ is called maximal if the only ideals
which contain $I$ are $I$ and $\mathbb{D}$. $I$ is called finitely generated if there exist $k$
elements (generators) $p_1, \ldots, p_k \in I$ such that any $i \in I$ is of the form

(1.2.2) $i = a_1 p_1 + \cdots + a_k p_k$

for some $a_1, \ldots, a_k \in \mathbb{D}$. For example, in $\mathbb{D}[x, y]$ the set of all polynomials
$p(x, y)$ such that

(1.2.3) $p(0, 0) = 0$

is an ideal generated by $x$ and $y$. An ideal is called a principal ideal if it is
generated by one element $p$.

Definition 1.2.3 $\mathbb{D}$ is called a Bezout domain, or simply BD and
denoted by $\mathbb{D}_B$, if any two elements $a, b \in \mathbb{D}$ have g.c.d. $(a, b)$ such that

(1.2.4) $(a, b) = pa + qb$,

for some $p, q \in \mathbb{D}$.

It is easy to show by induction that for $a_1, \ldots, a_n \in \mathbb{D}_B$

(1.2.5) $(a_1, \ldots, a_n) = \sum_{i=1}^n p_i a_i$, for some $p_1, \ldots, p_n \in \mathbb{D}_B$.



Lemma 1.2.4 An integral domain is a Bezout domain if and only if
any finitely generated ideal is principal.

Proof. Assume that an ideal $I$ of $\mathbb{D}_B$ is generated by $a_1, \ldots, a_n$. Then
(1.2.5) implies that $(a_1, \ldots, a_n) \in I$. Clearly $(a_1, \ldots, a_n)$ is a generator of $I$.
Assume now that any finitely generated ideal of $\mathbb{D}$ is principal. For given
$a, b \in \mathbb{D}$ let $I$ be the ideal generated by $a$ and $b$. Let $d$ be a generator of $I$.
So

(1.2.6) $d = pa + qb$.

Since $d$ generates $I$, $d$ divides $a$ and $b$. (1.2.6) implies that if $d'$ divides $a$
and $b$ then $d' \mid d$. Hence $d = (a, b)$ and $\mathbb{D}$ is $\mathbb{D}_B$. □
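For $\mathbb{D} = \mathbb{Z}$ the coefficients $p, q$ in (1.2.4) and (1.2.6) can be computed explicitly. The sketch below uses the extended Euclidean algorithm, which is one standard way to do this and is not the method of the proof above; the function name `bezout` is hypothetical.

```python
def bezout(a, b):
    """Return (d, p, q) with d = gcd(a, b) >= 0 and d == p*a + q*b (integers only)."""
    old_r, r = a, b
    old_p, p = 1, 0
    old_q, q = 0, 1
    # Invariants: old_r == old_p*a + old_q*b and r == p*a + q*b.
    while r != 0:
        t = old_r // r
        old_r, r = r, old_r - t * r
        old_p, p = p, old_p - t * p
        old_q, q = q, old_q - t * q
    if old_r < 0:  # normalize the g.c.d. by the unit -1
        old_r, old_p, old_q = -old_r, -old_p, -old_q
    return old_r, old_p, old_q
```

With $a = b = 0$ the function returns $d = 0$, in line with the convention $(0, 0) = 0$ stated at the start of this section.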

Let $I \subset \mathbb{D}[x, y]$ be the ideal given by (1.2.3). Clearly $(x, y) = 1$. As
$1 \notin I$, $I$ is not principal. As $x, y$ generate $I$ we obtain that $\mathbb{D}[x, y]$ is not
$\mathbb{D}_B$. In particular $\mathbb{F}[x_1, \ldots, x_n]$ is not $\mathbb{D}_B$ for $n \ge 2$. The same argument
shows that $H(\Omega)$ is not $\mathbb{D}_B$ for $\Omega \subset \mathbb{C}^n$ and $n \ge 2$. It is a standard fact
that $\mathbb{F}[x]$ is a Bezout domain [Lan67]. (See §1.3.) For a connected set $\Omega \subset \mathbb{C}$,
$H(\Omega)$ is $\mathbb{D}_B$. This result is implied by the following interpolation theorem
[Rud74, Thms 15.11, 15.15]:

Theorem 1.2.5 Let $\Omega \subset \mathbb{C}$ be an open set, $A \subset \Omega$ be a countable set
with no accumulation point in $\Omega$. Assume that for each $\zeta \in A$, $m(\zeta)$ and
$w_{0,\zeta}, \ldots, w_{m(\zeta),\zeta}$ are a nonnegative integer and $m(\zeta) + 1$ complex numbers,
respectively. Then there exists $f \in H(\Omega)$ such that

$f^{(n)}(\zeta) = n!\, w_{n,\zeta}$, $n = 0, \ldots, m(\zeta)$, for all $\zeta \in A$.

Furthermore, if all $w_{n,\zeta} = 0$ then there exists $g \in H(\Omega)$ such that all zeros
of $g$ are in $A$ and $g$ has a zero of order $m(\zeta) + 1$ at each $\zeta \in A$.

Theorem 1.2.6 Let $\Omega \subset \mathbb{C}$ be an open connected set. Then for $a, b \in H(\Omega)$
there exists $p \in H(\Omega)$ such that $(a, b) = pa + b$.

Proof. If $a = 0$ or $b = 0$ then $(a, b) = 1a + 1b$. Assume that $ab \ne 0$. Let
$A$ be the set of common zeros of $a(z)$ and $b(z)$. For each $\zeta \in A$ let $m(\zeta) + 1$
be the minimum multiplicity of the zero $z = \zeta$ of $a(z)$ and $b(z)$. Theorem
1.2.5 implies the existence of $f \in H(\Omega)$ which has its zeros at $A$, such that
at each $\zeta \in A$, $f(z)$ has a zero of order $m(\zeta) + 1$. Hence

$a = \hat{a} f$, $b = \hat{b} f$, $\hat{a}, \hat{b} \in H(\Omega)$.

Thus $\hat{a}$ and $\hat{b}$ do not have common zeros. If $A$ is empty then $\hat{a} = a$, $\hat{b} = b$.
Let $\hat{A}$ be the set of zeros of $\hat{a}$. Assume that for each $\zeta \in \hat{A}$, $\hat{a}$ has a
zero of multiplicity $n(\zeta) + 1$. Since $\hat{b}(\zeta) \ne 0$ for any $\zeta \in \hat{A}$, Theorem 1.2.5
implies the existence of a function $g \in H(\Omega)$ which satisfies the interpolation
conditions:

$\frac{d^k}{dz^k}\big(e^{g(z)}\big)\big|_{z=\zeta} = \frac{d^k}{dz^k}\hat{b}(z)\big|_{z=\zeta}$, $k = 0, \ldots, n(\zeta)$, $\zeta \in \hat{A}$.

Then

$p = \frac{e^g - \hat{b}}{\hat{a}}$, $(a, b) = f e^g = pa + b$

and the theorem follows. □

Corollary 1.2.7 Let $\Omega \subset \mathbb{C}$ be a connected set. Then $H(\Omega)$ is a Bezout
domain.



Problems 

1. Let $a, b, c \in \mathbb{D}_B$. Assume that $(a, b) = 1$, $(a, c) = 1$. Show that
$(a, bc) = 1$.

2. Let $I$ be a prime ideal in $\mathbb{D}$. Show that $\mathbb{D}/I$ (the set of all cosets of
the form $I + a$) is an integral domain.

3. Let $I$ be an ideal in $\mathbb{D}$. For $p \in \mathbb{D}$ denote by $I(p)$ the set:

$I(p) := \{a \in \mathbb{D} : a = bp + q, \text{ for all } b \in \mathbb{D}, q \in I\}$.

Show that $I(p)$ is an ideal. Prove that $I$ is a maximal ideal if and
only if for any $p \notin I$, $I(p) = \mathbb{D}$.

4. Show that an ideal $I$ is maximal if and only if $\mathbb{D}/I$ is a field.

5. Let $\mathbb{Z}[\sqrt{-3}] = \{a \in \mathbb{C}, \ a = p + q\sqrt{-3}, \ p, q \in \mathbb{Z}\}$. Show

(a) $\mathbb{Z}[\sqrt{-3}]$, viewed as a subset of $\mathbb{C}$, is a domain with respect to
the addition and multiplication in $\mathbb{C}$.

(b) Let $z = a + b\sqrt{-3} \in \mathbb{Z}[\sqrt{-3}]$. Then

$|z| = 1 \iff z = \pm 1$, $|z| = \sqrt{3} \iff z = \pm\sqrt{-3}$, $|z| = 2 \iff z = \pm 2$ or $z = \pm 1 \pm \sqrt{-3}$.

$|z| \ge \sqrt{7}$ for all other values of $z \ne 0$. In particular if $|z| = 2$
then $z$ is a prime.

(c) Let

$a = 4 = 2 \cdot 2 = (1 + \sqrt{-3})(1 - \sqrt{-3})$, $b = (1 + \sqrt{-3}) \cdot 2 = -(1 - \sqrt{-3})^2$.

Then any $d$ that divides $a$ and $b$ divides one of the following
primes $d_1 := 1 + \sqrt{-3}$, $\bar{d}_1 = 1 - \sqrt{-3}$, $d_2 := 2$.

(d) $\mathbb{Z}[\sqrt{-3}]$ is not GCD domain.
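The identities in Problem 5 can be checked numerically before attempting the proofs. The sketch below is only an illustration and assumes that $z = a + b\sqrt{-3}$ is stored as the integer pair `(a, b)`, so that $|z|^2 = a^2 + 3b^2$; the helper names are hypothetical.

```python
def mul(z, w):
    """(a + b*s)(c + d*s) with s*s = -3, on integer pairs."""
    (a, b), (c, d) = z, w
    return (a * c - 3 * b * d, a * d + b * c)


def norm(z):
    """|z|^2 = a^2 + 3 b^2 for z = a + b*sqrt(-3)."""
    a, b = z
    return a * a + 3 * b * b


def small_elements(bound):
    """All nonzero pairs (a, b) with a^2 + 3 b^2 <= bound."""
    r = int(bound ** 0.5) + 1
    return [(a, b)
            for a in range(-r, r + 1)
            for b in range(-r, r + 1)
            if (a, b) != (0, 0) and norm((a, b)) <= bound]
```

Since the norm is multiplicative and no element has norm 2, an element of norm 4 cannot split into two non-units, which is the observation behind part (b).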
1.3 $\mathbb{D}_U$, $\mathbb{D}_P$ and $\mathbb{D}_E$ domains

$p \in \mathbb{D}$ is irreducible (prime) if it is not a unit and every divisor of $p$ is
improper. A positive integer $p \in \mathbb{Z}$ is irreducible if and only if $p$ is prime.
A linear polynomial in $\mathbb{D}[x_1, \ldots, x_n]$ is irreducible.

Lemma 1.3.1 Let $\Omega \subset \mathbb{C}$ be a connected set. Then all irreducible
elements of $H(\Omega)$ (up to multiplication by invertible element) are of the form
$z - \zeta$ for each $\zeta \in \Omega$.

Proof. Let $f \in H(\Omega)$ be noninvertible. Then there exists $\zeta \in \Omega$ such
that $f(\zeta) = 0$. Hence $z - \zeta \mid f(z)$. Therefore the only irreducible elements
are $z - \zeta$. Clearly $\frac{z - \eta}{z - \zeta}$ is analytic in $\Omega$ if and only if $\eta = \zeta$. □

For $\zeta \in \mathbb{C}$, $H_\zeta$ has one irreducible element $z - \zeta$.

Definition 1.3.2 $\mathbb{D}$ is a unique factorization domain, or simply UFD
and denoted by $\mathbb{D}_U$, if any nonzero, noninvertible element $a$ can be factored
as a product of irreducible elements

(1.3.1) $a = p_1 \cdots p_r$,

and these primes are uniquely determined within order and invertible
factors.

$\mathbb{Z}$ and $H_\zeta$, $\zeta \in \mathbb{C}$ are $\mathbb{D}_U$. $\mathbb{F}[x_1, \ldots, x_n]$ is $\mathbb{D}_U$ [Lan67].

Lemma 1.3.3 Let $\Omega \subset \mathbb{C}$ be a connected open set. Then $H(\Omega)$ is not
a unique factorization domain.

Proof. Theorem 1.2.5 yields the existence of a nonzero function $a(z) \in
H(\Omega)$ which has a countable infinite number of zeros in $\Omega$ (which do not
accumulate in $\Omega$). Use Lemma 1.3.1 to deduce that $a$ can not be a product
of a finite number of irreducible elements. □

A straightforward consequence of this lemma is that for any open set $\Omega \subset
\mathbb{C}^n$, $H(\Omega)$ is not $\mathbb{D}_U$. See Problem 2.

Definition 1.3.4 $\mathbb{D}$ is a principal ideal domain, or simply PID and
denoted by $\mathbb{D}_P$, if every ideal of $\mathbb{D}$ is principal.

$\mathbb{Z}$ and $\mathbb{F}[z]$ are $\mathbb{D}_P$. It is known that any $\mathbb{D}_P$ is $\mathbb{D}_U$ [Lan67] or [vdW59].
Thus $H(\Omega)$ is not $\mathbb{D}_P$ for any open connected set $\Omega \subset \mathbb{C}^n$.

Definition 1.3.5 $\mathbb{D}$ is a Euclidean domain, or simply ED and denoted
by $\mathbb{D}_E$, if there exists a function $d : \mathbb{D} \setminus \{0\} \to \mathbb{Z}_+$ such that:

(1.3.2) for all $a, b \in \mathbb{D}$, $ab \ne 0$, $d(a) \le d(ab)$;

for any $a, b \in \mathbb{D}$, $ab \ne 0$, there exists $t, r \in \mathbb{D}$ such that

(1.3.3) $a = tb + r$, where either $r = 0$ or $d(r) < d(b)$.

We define $d(0) = -\infty$.

Standard examples of Euclidean domains are $\mathbb{Z}$ and $\mathbb{F}[x]$, see Problem 1.

Lemma 1.3.6 Any ideal $\{0\} \ne I \subset \mathbb{D}_E$ is principal.

Proof. Let $\min_{x \in I \setminus \{0\}} d(x) = d(a)$. Then $I$ is generated by $a$. □

Lemma 1.3.7 Let $\Omega \subset \mathbb{C}$ be a compact connected set. Then $H(\Omega)$ is
$\mathbb{D}_E$. Here $d(a)$ is the number of zeros of a nonzero function $a \in H(\Omega)$
counted with their multiplicities.

Proof. Let $a$ be a nonzero analytic function on an open connected set
$O \supset \Omega$. Since each zero of $a$ is an isolated zero of finite multiplicity the
assumption that $\Omega$ is compact yields that $a$ has a finite number of zeros in
$\Omega$. Hence $d(a) < \infty$. Let $p_a$ be a nonzero polynomial of degree $d(a)$ such
that $\hat{a} := \frac{a}{p_a}$ does not vanish on $\Omega$. By the definition $d(a) = d(p_a) = \deg p_a$.
Let $a, b \in H(\Omega)$, $ab \ne 0$. Since $\mathbb{C}[z]$ is $\mathbb{D}_E$ we deduce that

$p_a(z) = t(z) p_b(z) + r(z)$, $r = 0$ or $d(r) < d(p_b)$.

Hence

$a = \frac{\hat{a} t}{\hat{b}}\, b + \hat{a} r$, $\hat{a} r = 0$ or $d(\hat{a} r) = d(r) < d(p_b) = d(b)$.

□

The Weierstrass preparation theorem [GuR65] can be used to prove the
following extension of the above lemma to several complex variables:

Lemma 1.3.8 Let $\Omega \subset \mathbb{C}^n$ be a compact connected set. Then $H(\Omega)$ is

Let $a_1, a_2 \in \mathbb{D}_E \setminus \{0\}$. Assume that $d(a_1) \ge d(a_2)$. The Euclidean
algorithm consists of a sequence $a_1, \ldots, a_{k+1}$ which is defined recursively as
follows:

(1.3.4) $a_i = t_i a_{i+1} + a_{i+2}$, $a_{i+2} = 0$ or $d(a_{i+2}) < d(a_{i+1})$.

Since $d(a) \ge 0$ the Euclidean algorithm terminates: $a_1 \ne 0, \ldots, a_k \ne 0$ and
$a_{k+1} = 0$. Hence

(1.3.5) $(a_1, a_2) = a_k$.

See Problem 3.
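The recursion (1.3.4) can be written out for $\mathbb{D}_E = \mathbb{Z}$ with $d(a) = |a|$. The following sketch is an illustration under that assumption only; the function name `euclid_sequence` is hypothetical, and Python's `divmod` is used to produce $t_i$ and $a_{i+2}$.

```python
import math


def euclid_sequence(a1, a2):
    """Return [a_1, a_2, ..., a_k, 0] as in (1.3.4), for nonzero integers a1, a2."""
    seq = [a1, a2]
    while seq[-1] != 0:
        # a_i = t_i * a_{i+1} + a_{i+2} with |a_{i+2}| < |a_{i+1}|
        t, r = divmod(seq[-2], seq[-1])
        seq.append(r)
    return seq


seq = euclid_sequence(42, 12)
a_k = seq[-2]  # last nonzero term, the g.c.d. by (1.3.5)
assert abs(a_k) == math.gcd(42, 12)
```

The last nonzero entry is determined only up to the units $\pm 1$ of $\mathbb{Z}$, which is why the check compares absolute values.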



Problems

1. Show that the following domains are Euclidean.

(a) $\mathbb{Z}$, where $d(a) = |a|$ for any $a \in \mathbb{Z}$.

(b) $\mathbb{F}[x]$, where $d(p(x)) = \deg p(x)$ for each nonzero polynomial
$p(x) \in \mathbb{F}[x]$.

2. Let $\Omega \subset \mathbb{C}^n$ be an open set. Construct a nonzero function $f$ depending
on one variable in $\Omega$, which has an infinite number of zeros in $\Omega$.
Prove that $f$ can not be decomposed to a finite product of irreducible
elements. Hence $H(\Omega)$ is not $\mathbb{D}_U$.

3. Consider the equation (1.3.3) for $r \ne 0$. Show that $(a, b) = (b, r)$.
Using this result prove (1.3.5).

1.4 Factorizations in $\mathbb{D}[x]$

Let $\mathbb{F}$ be the field of quotients of $\mathbb{D}$. Assume that $p(x) \in \mathbb{D}[x]$. Suppose
that

$p(x) = p_1(x) p_2(x)$, $p_1(x), p_2(x) \in \mathbb{F}[x]$.

We discuss the problem when $p_1(x), p_2(x) \in \mathbb{D}[x]$. One has to take into
account that for any $q(x) \in \mathbb{F}[x]$

(1.4.1) $q(x) = \frac{p(x)}{a}$, $p(x) \in \mathbb{D}[x]$, $a \in \mathbb{D}$.

Definition 1.4.1 Let

(1.4.2) $p(x) = a_0 x^m + \cdots + a_m \in \mathbb{D}[x]$.

$p(x)$ is called normalized if $a_0 = 1$. Let $\mathbb{D}$ be GCD domain and denote
$c(p) = (a_0, \ldots, a_m)$. $p(x)$ is called primitive if $c(p) = 1$.

The following result follows from Problem 2. 

Lemma 1.4.2 Let $\mathbb{F}$ be the quotient field of $\mathbb{D}_G$. Then for any $q(x) \in
\mathbb{F}[x]$ there exists a decomposition (1.4.1) where $(c(p), a) = 1$. The polynomial
$p(x)$ is uniquely determined up to an invertible factor in $\mathbb{D}_G$. Furthermore,

(1.4.3) $q(x) = \frac{b}{a}\, r(x)$, $r(x) \in \mathbb{D}_G[x]$, $a, b \in \mathbb{D}_G$,

where $(a, b) = 1$ and $r(x)$ is primitive.

Lemma 1.4.3 (Gauss's lemma) Let $p(x), q(x) \in \mathbb{D}_U[x]$ be primitive.
Then $p(x)q(x)$ is primitive.

The proof of Gauss's lemma follows from the following proposition.

Proposition 1.4.4 Let $p, q \in \mathbb{D}_G[x]$. Assume that $\pi \in \mathbb{D}$ is a prime
element which divides $c(pq)$. Then $\pi$ divides either $c(p)$ or $c(q)$.

Proof. Clearly, it is enough to assume that $p, q \ne 0$. We prove the
Proposition by induction on $k = \deg p + \deg q$. For $k = 0$, $p(x) = p_0$, $q(x) =
q_0$. Hence $c(pq) = p_0 q_0$. Since $\pi \mid p_0 q_0$ we deduce that $\pi$ divides either
$p_0 = c(p)$ or $q_0 = c(q)$.

Assume that the proposition holds for $k \le l$ and assume that $k = l + 1$.
Let $p = \sum_{i=0}^m a_i x^i$, $q = \sum_{j=0}^n b_j x^j$, where $a_m b_n \ne 0$ and $l + 1 = m + n$.
So $\pi \mid a_m b_n$. Without loss of generality we may assume the nontrivial case
$\pi \mid a_m$ and $m > 0$. Let $r(x) = \sum_{i=0}^{m-1} a_i x^i$. Since $\pi \mid c(pq)$, $pq = a_m x^m q + rq$
and $\pi \mid a_m$, it is straightforward that $\pi \mid c(rq)$. As $\deg r + \deg q \le l$ we deduce
that $\pi$ divides either $c(r)$ or $c(q)$. If $\pi \mid c(q)$ the
proposition follows. If $\pi \mid c(r)$ then $\pi \mid c(p)$ and the proposition follows in this
case too. □
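Gauss's lemma is easy to test over $\mathbb{D}_U = \mathbb{Z}$. The sketch below assumes polynomials are stored as lists of integer coefficients (the order does not matter for the content); `content` and `poly_mul` are illustrative names, not notation from the text.

```python
from functools import reduce
from math import gcd


def content(p):
    """c(p): the g.c.d. of the integer coefficients of p (nonnegative representative)."""
    return reduce(gcd, p, 0)


def poly_mul(p, q):
    """Coefficient list of the product of two polynomials."""
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r


p = [2, 3]      # 2 + 3x, primitive
q = [3, 5, 7]   # 3 + 5x + 7x^2, primitive
assert content(p) == content(q) == 1
assert content(poly_mul(p, q)) == 1  # the product is again primitive
```

The same helpers can be used to experiment with the identity $c(pq) = c(p)c(q)$ of Problem 4 on non-primitive inputs.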



Corollary 1.4.5 Let $p(x) \in \mathbb{D}_U[x]$ be primitive. Assume that $p(x)$ is
irreducible in $\mathbb{F}[x]$, where $\mathbb{F}$ is the quotient field of $\mathbb{D}_U$. Then $p(x)$ is
irreducible in $\mathbb{D}_U[x]$.

Theorem 1.4.6 Let $\mathbb{F}$ be the quotient field of $\mathbb{D}_U$. Then any $p(x) \in
\mathbb{D}_U[x]$ has unique decomposition (up to invertible elements in $\mathbb{D}_U$):

(1.4.4) $p(x) = a\, q_1(x) \cdots q_s(x)$, $q_1, \ldots, q_s \in \mathbb{D}_U[x]$, $a \in \mathbb{D}_U$,

where $q_1(x), \ldots, q_s(x)$ are primitive and irreducible in $\mathbb{F}[x]$ and $a$ has
decomposition (1.3.1). Hence $\mathbb{D}_U[x]$ is UFD.

See [Lan67] and Problems 3-5.

Normalization 1.4.7 Let $\mathbb{F}$ be a field and assume that $p(x) \in \mathbb{F}[x]$ is
a nonconstant normalized polynomial in $\mathbb{F}[x]$. Let (1.4.4) be a decomposition
to irreducible factors. Normalize the decomposition (1.4.4) by letting
$q_1(x), \ldots, q_s(x)$ be normalized irreducible polynomials in $\mathbb{F}[x]$. (Then
$a = 1$.)

Lemma 1.4.3 and Corollary 1.4.5 yield (see Problem 5):

Theorem 1.4.8 Let $p(x)$ be a normalized nonconstant polynomial in
$\mathbb{D}_U[x]$. Let (1.4.4) be a normalized decomposition in $\mathbb{F}[x]$, where $\mathbb{F}$ is the
quotient field of $\mathbb{D}_U$. Then $q_1(x), \ldots, q_s(x)$ are irreducible polynomials in
$\mathbb{D}_U[x]$.

Theorem 1.4.9 Let $\Omega \subset \mathbb{C}^n$ be a connected set. Assume that $p(x)$ is a
normalized nonconstant polynomial in $H(\Omega)[x]$. Let (1.4.4) be a normalized
decomposition in $\mathcal{M}[x]$, where $\mathcal{M}$ is the field of meromorphic functions in
$\Omega$. Then each $q_j(x)$ is an irreducible polynomial in $H(\Omega)[x]$.

Proof. By the definition of $H(\Omega)$ we may assume that $p(x) \in H(\Omega_0)[x]$, $q_j(x) \in
\mathcal{M}(\Omega_0)[x]$, $j = 1, \ldots, s$ for some open connected $\Omega_0 \supset \Omega$. Let

(1.4.5) $q(x, z) = x^t + \sum_{r=1}^t \frac{\alpha_r(z)}{\beta_r(z)} x^{t-r}$,

$x \in \mathbb{C}$, $z \in \Omega_0$, $\alpha_r(z), \beta_r(z) \in H(\Omega_0)$, $r = 1, \ldots, t$.

Then $q(x, z)$ is analytic on $\Omega_0 \setminus \Gamma$, where $\Gamma$ is an analytic variety given by

$\Gamma = \{z \in \Omega_0 : \prod_{r=1}^t \beta_r(z) = 0\}$.

Let $x_1(z), \ldots, x_t(z)$ be the roots of $q(x, z) = 0$, which is well defined as an
unordered set of functions $\{x_1(z), \ldots, x_t(z)\}$ on $\Omega_0 \setminus \Gamma$. Suppose that each
$x_k(z)$ is bounded on some neighborhood $O$ of a point $\zeta \in \Gamma$. Then each
$\frac{\alpha_j(z)}{\beta_j(z)}$, which is the $j$-th symmetric function of $\{x_1(z), \ldots, x_t(z)\}$, is bounded on
$O$. The Riemann extension theorem [GrH78] implies that $\frac{\alpha_j(z)}{\beta_j(z)}$ is analytic
in $O$. If each $x_k(z)$ is bounded in the neighborhood of each $\zeta \in \Gamma$ it follows
that $\frac{\alpha_k(z)}{\beta_k(z)} \in H(\Omega_0)$, $k = 1, \ldots, t$.

The assumption that $p(x, z)$ is a normalized polynomial in $H(\Omega_0)[x]$ yields
that all the roots of $p(x, z) = 0$ are bounded on any compact set $S \subset \Omega$.
The above arguments show that each $q_j(x, z)$ in the decomposition (1.4.4)
of $p(x, z)$ is an irreducible polynomial in $H(\Omega)[x]$. □

Problems 

1. $a_1, \ldots, a_k \in \mathbb{D} \setminus \{0\}$ are said to have the least common multiple,
denoted by $\mathrm{lcm}(a_1, \ldots, a_k)$ and abbreviated as the lcm, if the
following conditions hold. Assume that $b \in \mathbb{D}$ is divisible by each
$a_i$, $i = 1, \ldots, k$. Then $\mathrm{lcm}(a_1, \ldots, a_k) \mid b$. (Note that the lcm is
defined up to an invertible element.) Let $\mathbb{D}$ be GCD domain. Show

(a) $\mathrm{lcm}(a_1, a_2) = \frac{a_1 a_2}{(a_1, a_2)}$.

(b) For $k > 2$, $\mathrm{lcm}(a_1, \ldots, a_k) = \frac{\mathrm{lcm}(a_1, \ldots, a_{k-1})\, a_k}{(\mathrm{lcm}(a_1, \ldots, a_{k-1}),\ a_k)}$.

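2. Let $\mathbb{F}$ be the division field of $\mathbb{D}_G$. Assume that $0 \ne q(x) \in \mathbb{F}[x]$.
Write $q(x) = \sum_{i \in I} \frac{b_i}{a_i} x^i$ where $a_i, b_i \in \mathbb{D}_G \setminus \{0\}$ for each $i \in I$, and
$I = \{0 \le i_1 < \ldots < i_k\}$ is a finite subset of $\mathbb{Z}_+$. Let $a'_i = \frac{a_i}{(a_i, b_i)}$, $b'_i =
\frac{b_i}{(a_i, b_i)}$ for $i \in I$. Then (1.4.1) holds, where $a = \mathrm{lcm}(a'_{i_1}, \ldots, a'_{i_k})$
and $p(x) = \sum_{i \in I} \frac{b'_i a}{a'_i} x^i$. Show that $(c(p), a) = 1$. Furthermore, if
$q(x) = \frac{r(x)}{c}$ for some $r(x) \in \mathbb{D}_G[x]$, $c \in \mathbb{D}_G$ then $c = ea$, $r(x) = e\,p(x)$
for some $e \in \mathbb{D}_G \setminus \{0\}$.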

3. Let $p(x)$ be given by (1.4.2) and put

$q(x) = b_0 x^n + \cdots + b_n$, $r(x) = p(x)q(x) = c_0 x^{m+n} + \cdots + c_{m+n}$.

Assume that $p(x), q(x) \in \mathbb{D}_U[x]$. Let $\pi$ be an irreducible element in
$\mathbb{D}_U$ such that

$\pi \mid a_i$, $i = 0, \ldots, \alpha$, $\pi \mid b_j$, $j = 0, \ldots, \beta$, $\pi \mid c_{\alpha+\beta+2}$.

Then either $\pi \mid a_{\alpha+1}$ or $\pi \mid b_{\beta+1}$.

4. Prove that if $p(x), q(x) \in \mathbb{D}_U[x]$ then $c(pq) = c(p)c(q)$.

Deduce from the above equality Lemma 1.4.3. Also if $p(x)$ and $q(x)$
are normalized polynomials then $p(x)q(x)$ is primitive.

5. Prove Theorem 1.4.8. 

6. Using the equality $\mathbb{D}[x_1, \ldots, x_{n-1}][x_n] = \mathbb{D}[x_1, \ldots, x_n]$ prove that $\mathbb{D}_U[x_1, \ldots, x_n]$
is UFD. Deduce that $\mathbb{F}[x_1, \ldots, x_n]$ is UFD.
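The decomposition (1.4.1) with $(c(p), a) = 1$, which Lemma 1.4.2 and Problem 2 describe, can be illustrated for $\mathbb{D}_G = \mathbb{Z}$ and $\mathbb{F} = \mathbb{Q}$. The sketch below relies on Python's `Fraction`, which already stores each coefficient in lowest terms (playing the role of $a'_i$, $b'_i$); the name `clear_denominators` is hypothetical.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def lcm(x, y):
    """Least common multiple of two positive integers."""
    return x * y // gcd(x, y)


def clear_denominators(q):
    """Given rational coefficients q, return (p, a) with q = p / a and p integral."""
    # a is the lcm of the reduced denominators, as in Problem 2.
    a = reduce(lcm, (c.denominator for c in q), 1)
    p = [int(c * a) for c in q]
    return p, a


q = [Fraction(1, 2), Fraction(2, 3), Fraction(0), Fraction(5, 6)]
p, a = clear_denominators(q)
content_p = reduce(gcd, p, 0)
assert gcd(content_p, a) == 1  # (c(p), a) = 1
```

Taking any other common denominator, for example $2a$, would give a pair $(p, a)$ whose content and denominator share a factor, which is what the uniqueness statement rules out.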



1.5 Elementary Divisor Domain 

Definition 1.5.1 $\mathbb{D}_G$ is an elementary divisor domain, or simply EDD
and denoted by $\mathbb{D}_{ED}$, if for any three elements $a, b, c \in \mathbb{D}$ there exists
$p, q, x, y \in \mathbb{D}$ such that

(1.5.1) $(a, b, c) = (px)a + (py)b + (qy)c$.

By letting $c = 0$ we obtain that $(a, b)$ is a linear combination of $a$ and $b$.
Hence an elementary divisor domain is a Bezout domain.

Theorem 1.5.2 Let $\mathbb{D}$ be a principal ideal domain. Then $\mathbb{D}$ is an
elementary divisor domain.

Proof. Without loss of generality we may assume that $abc \ne 0$, $(a, b, c) =
1$. Let $(a, c) = d$. Since $\mathbb{D}$ is $\mathbb{D}_U$ ([Lan67]), we decompose $a = a'a''$, where in
the prime decomposition (1.3.1) of $a$, $a'$ contains all the irreducible factors
of $a$ which appear in the decomposition of $d$ to irreducible factors. Thus

(1.5.2) $a = a'a''$, $(a', a'') = 1$, $(a', c) = (a, c)$, $(a'', c) = 1$,
and if $a', f$ are not coprime then $c, f$ are not coprime.

Hence there exists $q$ and $\alpha$ such that

(1.5.3) $b - 1 = -qc + \alpha a''$.

Let $d' = (a, b + qc)$. The above equality implies that $(d', a'') = 1$. Suppose
that $d'$ is not coprime with $a'$. Then there exists a noninvertible $f$ such that
$f$ divides $d'$ and $a'$. According to (1.5.2) $(f, c) = f'$ and $f'$ is not invertible.
Thus $f' \mid b$, which implies that $f'$ divides $a$, $c$ and $b$. This contradicts our
assumption that $(a, b, c) = 1$. So $(d', a') = 1$, which implies $(d', a) = 1$.
Therefore there exists $x, y \in \mathbb{D}$ such that $xa + y(b + qc) = 1$. This shows
(1.5.1) with $p = 1$. □

Theorem 1.5.3 Let $\Omega \subset \mathbb{C}$ be a connected set. Then $H(\Omega)$ is an
elementary divisor domain.

Proof. Given $a, b, c \in H(\Omega)$ we may assume that $a, b, c \in H(\Omega_0)$ for
some open connected set $\Omega_0 \supset \Omega$. Theorem 1.2.6 yields

(1.5.4) $(a, b, c) = (a, (b, c)) = a + y(b, c) = a + y(b + qc)$.
□

Problems 

1. $\mathbb{D}$ is called adequate if for any $0 \ne a, c \in \mathbb{D}$ (1.5.2) holds. Use the
proof of Theorem 1.5.2 to show that any adequate $\mathbb{D}_B$ is $\mathbb{D}_{ED}$.

2. Prove that for any connected set $\Omega \subset \mathbb{C}$, $H(\Omega)$ is an adequate domain
([Hel43]).

1.6 Modules 

Definition 1.6.1 $M$ is an abelian group if it has a binary operation,
denoted by $+$, which satisfies the conditions (1.1.1)-(1.1.5).

Definition 1.6.2 Let $S$ be a ring with identity. An abelian group $M$,
which has an operation $+$, is called a (left) $S$-module if for each $r \in S$, $v \in
M$ the product $rv$ is an element of $M$ such that the following properties
hold:

(1.6.1) $r(v_1 + v_2) = rv_1 + rv_2$, $(r_1 + r_2)v = r_1 v + r_2 v$,
$(rs)v = r(sv)$, $1v = v$.

$N \subset M$ is called a submodule if $N$ is an $S$-module.

Assume that $S$ does not have zero divisors. (I.e. if $r, s \in S$ and $rs = 0$
then either $r = 0$ or $s = 0$.) Then $M$ does not have zero divisors if

(1.6.2) $rv = 0$ if and only if $v = 0$ for any $r \ne 0$.

Assume that $\mathbb{D}$ is a domain. Then $M$ is called a $\mathbb{D}$-module if in addition
to the above property $M$ does not have zero divisors.

A standard example of an $S$-module is

(1.6.3) $S^m := \{\mathbf{v} = (v_1, \ldots, v_m)^T : v_i \in S, \ i = 1, \ldots, m\}$,

where

(1.6.4) $\mathbf{u} + \mathbf{v} = (u_1 + v_1, \ldots, u_m + v_m)^T$,
$r\mathbf{u} = (ru_1, \ldots, ru_m)^T$, $r \in S$.

Note that if $S$ does not have zero divisors then $S^n$ is an $S$-module with no
zero divisors.

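Definition 1.6.3 A $\mathbb{D}$-module $M$ is finitely generated if there exist $n$
elements (generators) $\mathbf{v}_1, \ldots, \mathbf{v}_n \in M$ such that any $\mathbf{v} \in M$ is of the form

(1.6.5) $\mathbf{v} = \sum_{i=1}^n a_i \mathbf{v}_i$, $a_i \in \mathbb{D}$, $i = 1, \ldots, n$.

If each $\mathbf{v}$ can be expressed uniquely in the above form then $\mathbf{v}_1, \ldots, \mathbf{v}_n$ is
called a basis in $M$, and $M$ is said to have a finite basis. We denote by
$[\mathbf{v}_1, \ldots, \mathbf{v}_n]$ a basis in $M$.

Note that $\mathbb{D}^n$ has a standard basis $\mathbf{e}_i = (\delta_{i1}, \ldots, \delta_{in})^T$, $i = 1, \ldots, n$.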

Let $\mathbb{F}$ be a field. Then an $\mathbb{F}$-module is called a vector space $V$ over $\mathbb{F}$. It
is a standard fact in linear algebra [HJ88] that a finitely generated $V$ has
a finite basis. A finitely generated vector space is called finite dimensional.
The number of vectors of a basis for a finite dimensional vector space $V$
is constant. It is called the dimension of $V$, and is denoted by $\dim V$. A
submodule of $V$ is called a subspace of $V$.

Let $M$ be a $\mathbb{D}$-module with a finite basis. Let $\mathbb{F}$ be the quotient field of $\mathbb{D}$.
It is possible to imbed $M$ in a vector space $V$ by considering all vectors $\mathbf{v}$
of the form (1.6.5), where $a_i \in \mathbb{F}$, $i = 1, \ldots, n$. (For a more general statement
see Problem 1.) Thus $\dim V = n$. Using this fact we obtain:

Lemma 1.6.4 Any two finite bases of a D module contain the same 
number of elements dim V. 

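One of the standard examples of submodules in $\mathbb{D}^n$ is as follows. Consider
the linear homogeneous system

(1.6.6) $\sum_{j=1}^n a_{ij} x_j = 0$, $a_{ij} \in \mathbb{D}$, $i = 1, \ldots, m$, $j = 1, \ldots, n$.

Then the set of solutions $\mathbf{x} = (x_1, \ldots, x_n)^T$ is a submodule of $\mathbb{D}^n$. In §1.12
we show that the above module has a basis if $\mathbb{D}$ is a Bezout domain.

Definition 1.6.5 Let $M$ be a module over $\mathbb{D}$. Assume that $M_i$ is
a submodule of $M$ for $i = 1, \ldots, k$. Then $M$ is called a direct sum of
$M_1, \ldots, M_k$, and denoted as $M = \oplus_{i=1}^k M_i$, if every element $\mathbf{m} \in M$ can
be expressed in a unique way as a sum $\mathbf{m} = \sum_{i=1}^k \mathbf{m}_i$, where $\mathbf{m}_i \in M_i$ for
$i = 1, \ldots, k$.

Definition 1.6.6 The ring of quaternions $\mathbb{H}$ is a four dimensional
vector space over $\mathbb{R}$ with the basis $1, \mathbf{i}, \mathbf{j}, \mathbf{k}$, i.e. vectors of the form

(1.6.7) $q = a + b\mathbf{i} + c\mathbf{j} + d\mathbf{k}$, $a, b, c, d \in \mathbb{R}$,

where

(1.6.8) $\mathbf{i}^2 = \mathbf{j}^2 = \mathbf{k}^2 = -1$, $\mathbf{i}\mathbf{j} = -\mathbf{j}\mathbf{i} = \mathbf{k}$, $\mathbf{j}\mathbf{k} = -\mathbf{k}\mathbf{j} = \mathbf{i}$, $\mathbf{k}\mathbf{i} = -\mathbf{i}\mathbf{k} = \mathbf{j}$.

It is known that $\mathbb{H}$ is a noncommutative division algebra over $\mathbb{R}$. See
Problem 5.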
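The multiplication table (1.6.8) extends bilinearly to all quaternions of the form (1.6.7). The sketch below is an illustration only: it assumes a quaternion $a + b\mathbf{i} + c\mathbf{j} + d\mathbf{k}$ is stored as the tuple `(a, b, c, d)`, and the names `qmul` and `conj` are hypothetical.

```python
def qmul(p, q):
    """Product of two quaternions, expanded with the relations (1.6.8)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # coefficient of i
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # coefficient of j
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # coefficient of k
    )


def conj(q):
    """Conjugate a - bi - cj - dk."""
    a, b, c, d = q
    return (a, -b, -c, -d)


I, J, K = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
assert qmul(I, J) == K and qmul(J, I) == (0, 0, 0, -1)  # ij = -ji = k
```

Multiplying a quaternion by its conjugate gives a real multiple of $1$, which is the computation behind the inverse formula in Problem 5(d).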

Problems

1. Let $M$ be a finitely generated module over $\mathbb{D}$. Let $\mathbb{F}$ be the quotient
field of $\mathbb{D}$. Show

(a) Assume that $M$ is generated by $\mathbf{a}_1, \ldots, \mathbf{a}_m$. Let $N = \{\mathbf{x} =
(x_1, \ldots, x_m)^T \in \mathbb{D}^m, \ \sum_{i=1}^m x_i \mathbf{a}_i = \mathbf{0}\}$. Then $N$ is a $\mathbb{D}$-module.

(b) Let $U \subset \mathbb{F}^m$ be the subspace generated by all vectors in $N$. (Any
vector in $U$ is a finite linear combination of vectors in $N$.) Then
any vector $\mathbf{u} \in U$ is of the form $\frac{1}{c}\mathbf{b}$, where $\mathbf{b} \in N$ and $0 \ne c \in \mathbb{D}$.

(c) Let $V = \mathbb{F}^m / U$. ($V$ can be constructed as follows. Assume that
$\dim U = l$. Pick a basis $[\mathbf{u}_1, \ldots, \mathbf{u}_l]$ in $U$ and complete this basis
to a basis in $\mathbb{F}^m$. So $[\mathbf{u}_1, \ldots, \mathbf{u}_l, \mathbf{w}_1, \ldots, \mathbf{w}_{m-l}]$ is a basis in $\mathbb{F}^m$.
Let $W = \mathrm{span}(\mathbf{w}_1, \ldots, \mathbf{w}_{m-l})$. Then any vector in $V$ is of the
form of a coset $\mathbf{w} + U$ for a unique vector $\mathbf{w} \in W$.)

(d) Define $\phi : M \to V$ as follows. Let $\mathbf{a} \in M$ and write $\mathbf{a} =
\sum_{i=1}^m a_i \mathbf{a}_i$. Set $\phi(\mathbf{a}) = (a_1, \ldots, a_m)^T + U$. Then

i. $\phi$ is well defined, i.e. does not depend on a particular
representation of $\mathbf{a}$ as a linear combination of $\mathbf{a}_1, \ldots, \mathbf{a}_m$.

iii. $\phi(a\mathbf{a} + b\mathbf{b}) = a\phi(\mathbf{a}) + b\phi(\mathbf{b})$ for any $a, b \in \mathbb{D}$ and $\mathbf{a}, \mathbf{b} \in M$.

iv. For any $\mathbf{v} \in V$ there exists $a \in \mathbb{D}$ and $\mathbf{a} \in M$ such that
$\phi(\mathbf{a}) = a\mathbf{v}$.

(e) Let $Y$ be a finite dimensional vector space over $\mathbb{F}$ with the
following properties.

i. There is an injection $\phi : M \to Y$, i.e. $\phi$ is one to one, such
that $\phi(a\mathbf{m} + b\mathbf{n}) = a\phi(\mathbf{m}) + b\phi(\mathbf{n})$ for any $a, b \in \mathbb{D}$ and
$\mathbf{m}, \mathbf{n} \in M$.

ii. For any $\mathbf{y} \in Y$ there exists $a \in \mathbb{D}$ and $\mathbf{m} \in M$ such that
$\phi(\mathbf{m}) = a\mathbf{y}$.

Then $\dim Y = \dim V$, where $V$ is defined in 1c.

Definition 1.6.7 A $\mathbb{D}$-module $M$ is called $k$-dimensional, if $M$ is
finitely generated and $\dim V = k$.

2. Let $M$ be a $\mathbb{D}$-module with a finite basis. Let $N$ be a submodule of
$M$. Show that if $\mathbb{D}$ is $\mathbb{D}_P$ then $N$ has a finite basis.

3. Let $M$ be a $\mathbb{D}$-module with a finite basis. Assume that $N$ is a finitely
generated submodule of $M$. Show that if $\mathbb{D}$ is $\mathbb{D}_P$ then $N$ has a finite
basis.

4. Let $M$ be a module over $\mathbb{D}$. Assume that $M_i$ is a submodule of $M$
for $i = 1, \ldots, k$. Then $N := M_1 + \ldots + M_k$ is the set of all $\mathbf{m}$ of the
form $\mathbf{m}_1 + \ldots + \mathbf{m}_k$, where $\mathbf{m}_i \in M_i$ for $i = 1, \ldots, k$. Show

(a) $N$ is a submodule of $M$.

(b) $\cap_{i=1}^k M_i$ is a submodule of $M$.

(c) Assume that $M_1, \ldots, M_k$ are finitely generated. Then $N$ is
finitely generated and $\dim N \le \sum_{i=1}^k \dim M_i$.

(d) Assume that $M_1, \ldots, M_k$ have bases and $N = \oplus_{i=1}^k M_i$.
Then $N$ has a basis and $\dim N = \sum_{i=1}^k \dim M_i$.

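5. Show

(a) $\mathbb{H}$ can be viewed as $\mathbb{C}^2$, where each $q \in \mathbb{H}$ of the form (1.6.7)
can be written as $q = z + w\mathbf{j}$, where $z = a + b\mathbf{i}$, $w = c + d\mathbf{i} \in \mathbb{C}$.
Furthermore, for any $z \in \mathbb{C}$, $\mathbf{j}z = \bar{z}\mathbf{j}$.

(b) $\mathbb{H}$ is a ring with the identity $1 = 1 + 0\mathbf{i} + 0\mathbf{j} + 0\mathbf{k}$.

(c) $(rq)s = q(rs)$ for any $q, s \in \mathbb{H}$ and $r \in \mathbb{R}$. Hence $\mathbb{H}$ is an algebra
over $\mathbb{R}$.

(d) Denote $|q| = \sqrt{a^2 + b^2 + c^2 + d^2}$, $\bar{q} = a - b\mathbf{i} - c\mathbf{j} - d\mathbf{k}$ for any $q$
of the form (1.6.7). Then $\bar{q}q = q\bar{q} = |q|^2$. Hence $|q|^{-2}\bar{q}$ is the
right and the left inverse of $q \ne 0$.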

1.7 Algebraically closed fields 

Definition 1.7.1 A field $\mathbb{F}$ is algebraically closed if any polynomial
$p(x) \in \mathbb{F}[x]$ of the form (1.4.2) splits to linear factors in $\mathbb{F}$:

(1.7.1) $p(x) = a_0 \prod_{i=1}^m (x - \xi_i)$, $\xi_i \in \mathbb{F}$, $i = 1, \ldots, m$, $a_0 \ne 0$.

The classical example of an algebraically closed field is the field of complex
numbers $\mathbb{C}$. The field of real numbers $\mathbb{R}$ is not algebraically closed.

Definition 1.7.2 Let $\mathbb{K} \supset \mathbb{F}$ be fields. Then $\mathbb{K}$ is an extension field of
$\mathbb{F}$. $\mathbb{K}$ is called a finite extension of $\mathbb{F}$ if $\mathbb{K}$ is a finite dimensional vector
space over $\mathbb{F}$. The dimension of the vector space $\mathbb{K}$ over $\mathbb{F}$ is called the
degree of $\mathbb{K}$ over $\mathbb{F}$ and is denoted by $[\mathbb{K} : \mathbb{F}]$.

Thus $\mathbb{C}$ is a finite extension of $\mathbb{R}$ of degree 2. It is known [Lan67], see
Problems 1-2:

Theorem 1.7.3 Let $p(x) \in \mathbb{F}[x]$. Then there exists a finite extension
$\mathbb{K}$ of $\mathbb{F}$ such that $p(x)$ splits into linear factors in $\mathbb{K}[x]$.

The classical Weierstrass preparation theorem in two complex variables
is an explicit example of the above theorem. We state the Weierstrass
preparation theorem in a form needed later [GuR65].

Theorem 1.7.4 Let $H_0$ be the ring of analytic functions in one variable
in the neighborhood of the origin $0 \in \mathbb{C}$. Let $p(\lambda) \in H_0[\lambda]$ be a normalized
polynomial of degree $n$

(1.7.2) $p(\lambda, z) = \lambda^n + \sum_{j=1}^n a_j(z) \lambda^{n-j}$, $a_j(z) \in H_0$, $j = 1, \ldots, n$.

Then there exists a positive integer $s \mid n!$ such that

(1.7.3) $p(\lambda, w^s) = \prod_{j=1}^n (\lambda - \lambda_j(w))$, $\lambda_j(w) \in H_0$, $j = 1, \ldots, n$.

In this particular case the extension field $\mathbb{K}$ of $\mathbb{F} = \mathcal{M}_0$ is the set of
multivalued functions in $z$, which are analytic in $z^{1/s}$ in the neighborhood of the
origin. Thus $\mathbb{K} = \mathcal{M}_0(w)$, where

(1.7.4) $w^s = z$.

The degree of $\mathbb{K}$ over $\mathbb{F}$ is $s$.

Problems 

1. Let $\mathbb{F}$ be a field and assume that $p(x) = x^d + a_d x^{d-1} + \ldots + a_1 \in \mathbb{F}[x]$, where $d > 1$. On the vector space $\mathbb{F}^d$ define a product as follows: $(b_1, \ldots, b_d)(c_1, \ldots, c_d) = (r_1, \ldots, r_d)$, where $(r_1, \ldots, r_d)$ is defined as follows. Let $b(x) = \sum_{i=1}^{d} b_i x^{i-1}$, $c(x) = \sum_{i=1}^{d} c_i x^{i-1}$, $r(x) = \sum_{i=1}^{d} r_i x^{i-1}$. Then $r(x)$ is the remainder of $b(x)c(x)$ under division by $p(x)$. I.e. $b(x)c(x) = g(x)p(x) + r(x)$ where $\deg r(x) < d$. Let $V_d$ be $\mathbb{F}^d$ with the above product.

Show

(a) $V_d$ is a commutative ring with identity $e_1 = (1, 0, \ldots, 0)$.

(b) $\mathbb{F}$ is isomorphic to span$(e_1)$, where $f \mapsto f e_1$.

(c) Let $e_i = (\delta_{1i}, \ldots, \delta_{di})$, $i = 2, \ldots, d$. Then

$e_2^{\,i} = e_{1+i}, \ i = 0, \ldots, d-1, \quad p(e_2) = 0.$

(d) $V_d$ is a domain if and only if $p(x)$ is an irreducible polynomial over $\mathbb{F}[x]$.

(e) $V_d$ is a field if and only if $p(x)$ is an irreducible polynomial over $\mathbb{F}[x]$.

(f) Assume that $p(x) \in \mathbb{F}[x]$ is irreducible. Then $K := V_d$ is an extension field of $\mathbb{F}$ with $[K : \mathbb{F}] = d$. Furthermore $p(x)$ viewed as $p(x) \in K[x]$ decomposes to $p(x) = (x - e_2)q(x)$, where $q(x) = x^{d-1} + \sum_{i=1}^{d-1} g_i x^{i-1} \in K[x]$.

2. Let $\mathbb{F}$ be a field and $p(x) \in \mathbb{F}[x]$. Show that there exists a finite extension field $K$ such that $p(x)$ splits in $K$. Furthermore $[K : \mathbb{F}] \le (\deg p)!$
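
The product of Problem 1 can be tried out numerically. The following is a minimal sketch, assuming $\mathbb{F} = \mathbb{Q}$ (represented by Python's `Fraction`); the function name `poly_mul_mod` and the argument layout are illustrative choices, not notation from the text. Coefficient lists are ordered as $(b_1, \ldots, b_d)$, lowest degree first, and `a` holds $(a_1, \ldots, a_d)$.

```python
from fractions import Fraction


def poly_mul_mod(b, c, a):
    """Product in F^d induced by p(x) = x^d + a_d x^(d-1) + ... + a_1.

    b, c: coefficient lists (b_1, ..., b_d), lowest degree first.
    a:    list (a_1, ..., a_d) of the non-leading coefficients of p.
    Returns (r_1, ..., r_d), the remainder of b(x) c(x) modulo p(x).
    """
    d = len(a)
    # Plain polynomial product b(x) c(x); its degree is at most 2d - 2.
    prod = [Fraction(0)] * (2 * d - 1)
    for i, bi in enumerate(b):
        for j, cj in enumerate(c):
            prod[i + j] += Fraction(bi) * Fraction(cj)
    # Reduce from the top: x^k = -(a_1 x^(k-d) + ... + a_d x^(k-1)) mod p.
    for k in range(2 * d - 2, d - 1, -1):
        coef = prod[k]
        prod[k] = Fraction(0)
        for i in range(d):
            prod[k - d + i] -= coef * Fraction(a[i])
    return prod[:d]


# p(x) = x^2 + 1, so e_2 = (0, 1) plays the role of a square root of -1.
e2_squared = poly_mul_mod([0, 1], [0, 1], [1, 0])
```

Here `e2_squared` equals $(-1, 0)$, which is $p(e_2) = 0$ rewritten as $e_2^2 = -e_1$, in line with part (c).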

1.8 The resultant and the discriminant 

Let $\mathbb{D}$ be an integral domain. Suppose that

(1.8.1) $p(x) = a_0 x^m + \cdots + a_m, \quad q(x) = b_0 x^n + \cdots + b_n \in \mathbb{D}[x].$

Assume furthermore that $m, n \ge 1$ and $a_0 b_0 \ne 0$. Let $\mathbb{F}$ be the quotient field of $\mathbb{D}$ and assume that $K$ is a finite extension of $\mathbb{F}$ such that $p(x)$ and $q(x)$ split to linear factors in $K$. That is

(1.8.2)
$p(x) = a_0 \prod_{i=1}^{m} (x - u_i), \quad u_i \in K, \ i = 1, \ldots, m, \ a_0 \ne 0,$
$q(x) = b_0 \prod_{j=1}^{n} (x - v_j), \quad v_j \in K, \ j = 1, \ldots, n, \ b_0 \ne 0.$

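Then the resultant $R(p, q)$ of $p, q$ and the discriminant $D(p)$ of $p$ are defined as follows.

(1.8.3)
$R(p,q) = a_0^{n} b_0^{m} \prod_{i,j=1}^{m,n} (u_i - v_j),$
$D(p) = a_0^{2m-2} \prod_{1 \le i < j \le m} (u_i - u_j)^2.$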

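It is a classical result that $R(p,q) \in \mathbb{D}[a_0, \ldots, a_m, b_0, \ldots, b_n]$ and $D(p) \in \mathbb{D}[a_0, \ldots, a_m]$, e.g. [vdW59]. More precisely, we have:

Theorem 1.8.1 Let

$a = (a_0, \ldots, a_m) \in \mathbb{D}^{m+1}, \quad b = (b_0, \ldots, b_n) \in \mathbb{D}^{n+1},$

$p(x) = \sum_{i=0}^{m} a_i x^{m-i}, \quad q(x) = \sum_{j=0}^{n} b_j x^{n-j}.$

Then $R(p,q) = \det C(a, b)$, where

$C(a,b) = \begin{bmatrix}
a_0 & a_1 & a_2 & \cdots & a_m & 0 & \cdots & 0 \\
0 & a_0 & a_1 & \cdots & a_{m-1} & a_m & \cdots & 0 \\
\vdots & & \ddots & \ddots & & & \ddots & \vdots \\
0 & \cdots & 0 & a_0 & a_1 & a_2 & \cdots & a_m \\
b_0 & b_1 & \cdots & b_{n-1} & b_n & 0 & \cdots & 0 \\
0 & b_0 & b_1 & \cdots & b_{n-1} & b_n & \cdots & 0 \\
\vdots & & \ddots & \ddots & & & \ddots & \vdots \\
0 & \cdots & 0 & b_0 & b_1 & b_2 & \cdots & b_n
\end{bmatrix}$

is an $(m + n) \times (m + n)$ matrix, whose first $n$ rows are the shifted copies of $a$ and whose last $m$ rows are the shifted copies of $b$.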
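
The statement of Theorem 1.8.1 can be checked on small examples. The following is a minimal sketch, assuming $\mathbb{D} = \mathbb{Z}$ and polynomials with known rational roots; the helper names `sylvester`, `det` and `resultant_from_roots` are illustrative and not part of the text. The determinant is computed exactly over $\mathbb{Q}$ by Gaussian elimination.

```python
from fractions import Fraction


def sylvester(a, b):
    """C(a, b) for a = (a_0..a_m), b = (b_0..b_n): n shifted rows of a, then m shifted rows of b."""
    m, n = len(a) - 1, len(b) - 1
    size = m + n
    rows = []
    for i in range(n):
        rows.append([0] * i + list(a) + [0] * (size - m - 1 - i))
    for i in range(m):
        rows.append([0] * i + list(b) + [0] * (size - n - 1 - i))
    return rows


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    mat = [[Fraction(x) for x in row] for row in matrix]
    size = len(mat)
    result = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if mat[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            mat[col], mat[pivot] = mat[pivot], mat[col]
            result = -result  # a row interchange flips the sign
        result *= mat[col][col]
        for r in range(col + 1, size):
            factor = mat[r][col] / mat[col][col]
            for k in range(col, size):
                mat[r][k] -= factor * mat[col][k]
    return result


def resultant_from_roots(a0, us, b0, vs):
    """a_0^n b_0^m times the product of all (u_i - v_j), computed from the roots."""
    value = Fraction(a0) ** len(vs) * Fraction(b0) ** len(us)
    for u in us:
        for v in vs:
            value *= Fraction(u) - Fraction(v)
    return value


# p(x) = (x - 1)(x - 2) = x^2 - 3x + 2 and q(x) = x - 3.
lhs = det(sylvester([1, -3, 2], [1, -3]))
rhs = resultant_from_roots(1, [1, 2], 1, [3])
```

In this example both `lhs` and `rhs` equal $2$. Replacing $q$ by a polynomial that shares a root with $p$ makes the determinant vanish, which is the observation used in the proof.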

Proof. Let $\mathbb{F}$ be the quotient field of $\mathbb{D}$, and assume that $p, q \in \mathbb{D}[x]$ split in a finite extension field $K$ of $\mathbb{F}$. Let $c(x) = \sum_{i=0}^{n-1} c_i x^{n-1-i}$, $d(x) = \sum_{j=0}^{m-1} d_j x^{m-1-j} \in \mathbb{F}[x]$. Then $c(x)p(x) + d(x)q(x) = \sum_{l=0}^{m+n-1} g_l x^{m+n-1-l}$. Denote

$f = (c_0, \ldots, c_{n-1}, d_0, \ldots, d_{m-1}), \quad g = (g_0, \ldots, g_{m+n-1}) \in \mathbb{F}^{m+n}.$

A straightforward calculation shows that $f\,C(a,b) = g$.

Assume that $\det C(a, b) \ne 0$. Let $f = (0, \ldots, 0, 1)\,C(a, b)^{-1}$. Hence there exist $c(x), d(x) \in \mathbb{F}[x]$ of the above form such that $c(x)p(x) + d(x)q(x) = 1$. Hence $p, q$ do not have common zeros in $K$.

We now show that if $a_0 b_0 \ne 0$ then $R(p,q) = \det C(a, b)$. Divide the first $n$ rows of $C(a,b)$ by $a_0$ and the last $m$ rows of $C(a, b)$ by $b_0$, to deduce that it is enough to show the equality $R(p, q) = \det C(a, b)$ in the case $a_0 = b_0 = 1$. Then $p(x) = \prod_{i=1}^{m}(x - u_i)$, $q(x) = \prod_{j=1}^{n}(x - v_j) \in K[x]$. Recall that $(-1)^i a_i$ and $(-1)^j b_j$ are the $i$-th and $j$-th elementary symmetric polynomials in $u_1, \ldots, u_m$ and $v_1, \ldots, v_n$, respectively:

(1.8.4)
$a_i = (-1)^i \sum_{1 \le l_1 < \cdots < l_i \le m} u_{l_1} \cdots u_{l_i}, \quad i = 1, \ldots, m,$
$b_j = (-1)^j \sum_{1 \le l_1 < \cdots < l_j \le n} v_{l_1} \cdots v_{l_j}, \quad j = 1, \ldots, n.$






Then $C(a, b)$ is a matrix with polynomial entries in $u = (u_1, \ldots, u_m)$, $v = (v_1, \ldots, v_n)$. Hence $s(u, v) := \det C(a,b)$ is a polynomial in $m+n$ variables.

Assume that $u_i = v_j$ for some $i \in [1,m]$, $j \in [1,n]$. Then $p(x)$ and $q(x)$ have a common factor $x - u_i = x - v_j$. The above arguments show that $s(u, v) = 0$. Hence $s(u, v)$ is divisible by $t(u, v) = \prod_{i=1,j=1}^{m,n}(u_i - v_j)$. So $s(u, v) = h(u,v)\,t(u, v)$, for some polynomial $h(u,v)$.

Consider $s(u, v), t(u, v), h(u, v)$ as polynomials in $v$ with coefficients in $\mathbb{D}[u]$. Then $\deg_v t(u,v) = nm$ and the term of the highest degree is $(-1)^{mn} v_1^m \cdots v_n^m$. Observe next that the contribution of the variables $v$ in $\det C(a, b)$ comes from its last $m$ rows. The term of the maximal degree in each such row is $n$, which comes only from $b_n = (-1)^n v_1 \cdots v_n$. Hence the coefficient of the product $b_n^m$ comes from the minor of $C(a, b)$ based on the first $n$ rows and columns. Clearly this minor is equal to $a_0^n = 1$. So $h(u, v)$ is a polynomial in $u$ only. Furthermore $h(u) = 1$.

□

If $\mathbb{F}$ is a field of characteristic $0$ then

(1.8.5) $D(p) = \pm a_0^{-1} R(p, p').$

Note that if $a_i, b_i$ are given the weight $i$ for $i = 0, 1, \ldots$, then $R(p, q)$ and $D(p)$ are polynomials with total degrees $mn$ and $m(m - 1)$ respectively. See Problem 4.

Problems 

1. Let $\mathbb{D}$ be a domain and assume that $p(x) = x^m$, $q(x) = (x + 1)^n$. Show

(a) $R(p,q) = 1$.

(b) Let $a = (1, 0, \ldots, 0) \in \mathbb{D}^{m+1}$, $b = \left(\binom{n}{0}, \binom{n}{1}, \ldots, \binom{n}{n}\right) \in \mathbb{D}^{n+1}$. Let $C(a, b)$ be defined as in Theorem 1.8.1. Then $\det C(a, b) = 1$.

2. Let $u = (u_1, \ldots, u_m)$, $v = (v_1, \ldots, v_n)$. Assume that each $a_i \in \mathbb{D}[u]$, $b_j \in \mathbb{D}[v]$ is a multilinear polynomial for $i = 0, \ldots, m$, $j = 0, \ldots, n$. (The degree of $a_i, b_j$ with respect to any variable is at most 1.) Let $C(a, b)$ be defined as in Theorem 1.8.1. Show that $\det C(a, b)$ is a polynomial of degree at most $n$ and $m$ with respect to $u_i$ and $v_j$ respectively, for any $i = 1, \ldots, m$ and $j = 1, \ldots, n$.

3. Let the assumptions of Theorem 1.8.1 hold. Show

(a) If $a_0 = b_0 = 0$ then $\det C(a, b) = 0$.

(b) Assume that $p(x)$ is not a zero polynomial and $a_0 = 0$, $b_0 \ne 0$. Then $\det C(a, b) = 0$ if and only if $p$ and $q$ have a common root in an extension field $K$ of $\mathbb{F}$, where $p$ and $q$ split.

4. Let $C(a, b)$ be defined as in Theorem 1.8.1. View $\det C(a, b)$ as a polynomial $F(a,b)$. Assume that the weight $\omega(a_i) = i$, $\omega(b_j) = j$. Then the weight of a monomial in the variables $a, b$ is the sum of the weights of each variable times the number of times it appears in this monomial. Show

(a) Each nontrivial monomial in $F(a, b)$ is of weight $mn$.

(b) Assume as in the proof of Theorem 1.8.1 that $a_0 = b_0 = 1$ and $a_i$ and $b_j$ are the $i$-th and $j$-th elementary symmetric polynomials in $u$ and $v$ respectively. Then each nontrivial monomial in $u, v$ appearing in $F(a(u), b(v))$ is of total degree $mn$.



1.9 The ring $\mathbb{F}[x_1, \ldots, x_n]$

In §1.2 we pointed out that $\mathbb{F}[x_1, \ldots, x_n]$ is not $\mathbb{D}_B$ for $n \ge 2$. It is known [Lan67] that $\mathbb{F}[x_1, \ldots, x_n]$ is Noetherian:

Definition 1.9.1 $\mathbb{D}$ is Noetherian, denoted by $\mathbb{D}_N$, if any ideal of $\mathbb{D}$ is finitely generated.

In what follows we assume that $\mathbb{F}$ is algebraically closed. Let $p_1, \ldots, p_k \in \mathbb{F}[x_1, \ldots, x_n]$. Denote by $U(p_1, \ldots, p_k)$ the common set of zeros of $p_1, \ldots, p_k$:

(1.9.1) $U(p_1, \ldots, p_k) = \{x = (x_1, \ldots, x_n)^T : \ p_j(x) = 0, \ j = 1, \ldots, k\}.$

$U(p_1, \ldots, p_k)$ may be an empty set. $U(p_1, \ldots, p_k)$ is called an algebraic variety (in $\mathbb{F}^n$). It is known [Lan67] that any nonempty variety $U$ in $\mathbb{F}^n$ splits as

(1.9.2) $U = \bigcup_{i=1}^{l} V_i,$

where each $V_i$ is an irreducible algebraic variety, which is not contained in any other $V_j$. Over $\mathbb{C}$ each irreducible variety $V \subset \mathbb{C}^n$ is a closed connected set. Furthermore, there exists a strict subvariety $W \subset V$ (of singular points of $V$) such that $V \setminus W$ is a connected analytic manifold of complex dimension $d$ in $\mathbb{C}^n$. $\dim V := d$ is called the dimension of $V$. If $d = 0$ then $V$ consists of one point. For any set $U \subset \mathbb{F}^n$ let $I(U)$ be the ideal of polynomials vanishing on $U$:

(1.9.3) $I(U) = \{p \in \mathbb{F}[x_1, \ldots, x_n] : \ p(x) = 0, \ \forall x \in U\}.$

Theorem 1.9.2 (Hilbert Nullstellensatz) Let $\mathbb{F}$ be an algebraically closed field. Let $I \subset \mathbb{F}[x_1, \ldots, x_n]$ be an ideal generated by $p_1, \ldots, p_k$. Assume that $g \in \mathbb{F}[x_1, \ldots, x_n]$. Then $g^j \in I$ for some positive integer $j$ if and only if $g \in I(U(p_1, \ldots, p_k))$.

Corollary 1.9.3 Let $p_1, \ldots, p_k \in \mathbb{F}[x_1, \ldots, x_n]$, where $\mathbb{F}$ is an algebraically closed field. Then $p_1, \ldots, p_k$ generate $\mathbb{F}[x_1, \ldots, x_n]$ if and only if $U(p_1, \ldots, p_k) = \emptyset$.

1.10 Matrices and homomorphisms 

Notation 1.10.1 For a set $S$ denote by $S^{m \times n}$ the set of all $m \times n$ matrices $A = [a_{ij}]_{i=j=1}^{m,n}$, where each $a_{ij} \in S$.

Definition 1.10.2 Let $M, N$ be $\mathbb{D}$-modules. Let $T : N \to M$. $T$ is a homomorphism if

(1.10.1) $T(au + bv) = aTu + bTv$, for all $u, v \in N$, $a, b \in \mathbb{D}$.

Let

Range $T = \{u \in M : \ u = Tv, \ v \in N\}$,
Ker $T = \{v \in N : \ Tv = 0\}$,

be the range and the kernel of $T$. Denote by Hom$(N, M)$ the set of all homomorphisms of $N$ to $M$.

$T \in$ Hom$(N, M)$ is an isomorphism if there exists $Q \in$ Hom$(M, N)$ such that $QT$ and $TQ$ are the identity maps on $N$ and $M$ respectively. $M$ and $N$ are isomorphic if there exists an isomorphism $T \in$ Hom$(N, M)$.

Hom$(N, M)$ is a $\mathbb{D}$-module with

$(aS + bT)v = aSv + bTv, \quad a, b \in \mathbb{D}, \ S, T \in$ Hom$(N, M), \ v \in N.$

Assume that $M$ and $N$ have finite bases. Let $[u_1, \ldots, u_m]$ and $[v_1, \ldots, v_n]$ be bases in $M$ and $N$ respectively. Then there exists a natural isomorphism between Hom$(N, M)$ and $\mathbb{D}^{m \times n}$. For each $T \in$ Hom$(N, M)$ let $A = [a_{ij}] \in \mathbb{D}^{m \times n}$ be defined as follows:

(1.10.2) $T v_j = \sum_{i=1}^{m} a_{ij} u_i, \quad j = 1, \ldots, n.$

Conversely, for each $A = [a_{ij}] \in \mathbb{D}^{m \times n}$ there exists a unique $T \in$ Hom$(N, M)$ which satisfies (1.10.2). The matrix $A$ is called the representation matrix of $T$ in the bases $[u_1, \ldots, u_m]$ and $[v_1, \ldots, v_n]$.

Notation 1.10.3 For a positive integer $n$ denote $[n] = \{1, \ldots, n\}$. For $k \in [n]$ denote by $[n]_k$ the set of all subsets of $[n]$ of cardinality $k$. Each $\alpha \in [n]_k$ is represented by $(\alpha_1, \ldots, \alpha_k)$, where $\alpha_1, \ldots, \alpha_k$ are integers satisfying $1 \le \alpha_1 < \ldots < \alpha_k \le n$.

Definition 1.10.4 Let $\mathbb{D}$ be a domain and $A \in \mathbb{D}^{m \times n}$. Assume that $\alpha = (\alpha_1, \ldots, \alpha_k) \in [m]_k$, $\beta = (\beta_1, \ldots, \beta_l) \in [n]_l$. Denote by $A[\alpha, \beta] = [a_{\alpha_i \beta_j}]_{i,j=1}^{k,l}$ the $k \times l$ submatrix of $A$. For $k = l$, $\det A[\alpha, \beta]$ is called an $(\alpha, \beta)$ minor, $k$-minor, or simply a minor of $A$. The rank of $A$, denoted by rank $A$, is the maximal size of a nonvanishing minor of $A$. (The rank of the zero matrix is 0.) The nullity of $A$, denoted by nul $A$, is $n -$ rank $A$.

Any $A \in \mathbb{D}^{m \times n}$ can be viewed as $T \in$ Hom$(\mathbb{D}^n, \mathbb{D}^m)$, where $Tx := Ax$, $x = (x_1, \ldots, x_n)^T$. We will sometimes denote $T$ by $A$. If $\mathbb{D}$ is $\mathbb{D}_B$ then Range $A$ has a finite basis of dimension rank $A$ (Problem 1).

We now study the relations between the representation matrices of a fixed $T \in$ Hom$(N, M)$ with respect to different bases in $M$ and $N$.

Definition 1.10.5 $U \in \mathbb{D}^{n \times n}$ is called invertible (unimodular) if $\det U$ is an invertible element in $\mathbb{D}$.

Proposition 1.10.6 $U \in \mathbb{D}^{n \times n}$ is invertible if and only if there exists $V \in \mathbb{D}^{n \times n}$ such that either $UV$ or $VU$ is equal to the identity matrix $I$.

Proof. Let $\mathbb{F}$ be the quotient field of $\mathbb{D}$. Assume first that $\det U$ is an invertible element in $\mathbb{D}$. Then $U$ is an invertible matrix in $\mathbb{F}^{n \times n}$, where

(1.10.3) $U^{-1} = \frac{1}{\det U}\, \mathrm{adj}\, U,$

(1.10.4) $\mathrm{adj}\, A = \left[(-1)^{i+j} \det A\big[[n] \setminus \{j\}, [n] \setminus \{i\}\big]\right]_{i,j=1}^{n}.$

Clearly $V := U^{-1} \in \mathbb{D}^{n \times n}$. Assume now that there exists $V \in \mathbb{D}^{n \times n}$ such that $VU = I$. Then $1 = \det VU = \det V \det U$ and $(\det U)^{-1} = \det V \in \mathbb{D}$. Similarly $\det U$ is invertible in $\mathbb{D}$ if $UV = I$. □
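
Formulas (1.10.3) and (1.10.4) translate directly into a computation. The following is a minimal sketch, assuming $\mathbb{D} = \mathbb{Z}$, where the invertible elements are $\pm 1$; the names `det`, `adjugate`, `unimodular_inverse` and `matmul` are illustrative. The Leibniz formula is used for the determinant, which is only practical for small $n$.

```python
from itertools import permutations


def det(matrix):
    """Determinant by the Leibniz formula (adequate for small integer matrices)."""
    n = len(matrix)
    total = 0
    for perm in permutations(range(n)):
        # Sign of the permutation from its number of inversions.
        inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
        term = -1 if inversions % 2 else 1
        for row, col in enumerate(perm):
            term *= matrix[row][col]
        total += term
    return total


def adjugate(matrix):
    """adj A as in (1.10.4): entry (i, j) uses the minor with row j and column i removed."""
    n = len(matrix)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            minor = [[matrix[r][c] for c in range(n) if c != i]
                     for r in range(n) if r != j]
            adj[i][j] = (-1) ** (i + j) * det(minor)
    return adj


def unimodular_inverse(matrix):
    """Inverse over Z of an integer matrix whose determinant is +1 or -1."""
    d = det(matrix)
    if d not in (1, -1):
        raise ValueError("det U is not invertible in Z")
    # 1/d equals d when d is +1 or -1, so no division is needed.
    return [[d * x for x in row] for row in adjugate(matrix)]


def matmul(x, y):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


U = [[1, 2, 3], [0, 1, 4], [0, 0, 1]]
V = unimodular_inverse(U)
```

For this `U` both `matmul(U, V)` and `matmul(V, U)` give the identity matrix, while a matrix such as `[[2, 0], [0, 1]]` is rejected because its determinant $2$ is not invertible in $\mathbb{Z}$.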



Notation 1.10.7 Denote by GL$(n, \mathbb{D})$ the group of invertible matrices in $\mathbb{D}^{n \times n}$.

Lemma 1.10.8 Let $M$ be a $\mathbb{D}$-module with a finite basis $[u_1, \ldots, u_m]$. Then $[\tilde u_1, \ldots, \tilde u_m]$ is a basis in $M$ if and only if the matrix $Q = [q_{ki}] \in \mathbb{D}^{m \times m}$ given by the equalities

(1.10.5) $\tilde u_i = \sum_{k=1}^{m} q_{ki} u_k, \quad i = 1, \ldots, m,$

is an invertible matrix.

Proof. Suppose first that $[\tilde u_1, \ldots, \tilde u_m]$ is a basis in $M$. Then

(1.10.6) $u_k = \sum_{i=1}^{m} r_{ik} \tilde u_i, \quad k = 1, \ldots, m.$

Let $R = [r_{ik}]_1^m$. Insert (1.10.6) to (1.10.5) and use the assumption that $[\tilde u_1, \ldots, \tilde u_m]$ is a basis to obtain that $RQ = I$. Proposition 1.10.6 yields that $Q \in$ GL$(m, \mathbb{D})$. Assume now that $Q$ is invertible. Let $R = Q^{-1}$. Hence (1.10.6) holds. It is straightforward to deduce that $[\tilde u_1, \ldots, \tilde u_m]$ is a basis in $M$. □

Definition 1.10.9 Let $A, B \in \mathbb{D}^{m \times n}$. Then $A$ and $B$ are right equivalent, left equivalent and equivalent if the following conditions hold respectively:

(1.10.7) $B = AP$ for some $P \in$ GL$(n, \mathbb{D})$ $\quad (A \sim_r B)$,

(1.10.8) $B = QA$ for some $Q \in$ GL$(m, \mathbb{D})$ $\quad (A \sim_l B)$,

(1.10.9) $B = QAP$ for some $P \in$ GL$(n, \mathbb{D})$, $Q \in$ GL$(m, \mathbb{D})$ $\quad (A \sim B)$.

Clearly, all the above relations are equivalence relations.

Theorem 1.10.10 Let $M$ and $N$ be $\mathbb{D}$-modules with finite bases having $m$ and $n$ elements respectively. Then $A, B \in \mathbb{D}^{m \times n}$ represent some $T \in$ Hom$(N, M)$ in certain bases as follows:

(l) $A \sim_l B$ if and only if $A$ and $B$ represent $T$ in the corresponding bases of $M$ and $N$ respectively

(1.10.10) $[u_1, \ldots, u_m], [v_1, \ldots, v_n]$ and $[\tilde u_1, \ldots, \tilde u_m], [v_1, \ldots, v_n]$.

(r) $A \sim_r B$ if and only if $A$ and $B$ represent $T$ in the corresponding bases of $M$ and $N$ respectively

(1.10.11) $[u_1, \ldots, u_m], [v_1, \ldots, v_n]$ and $[u_1, \ldots, u_m], [\tilde v_1, \ldots, \tilde v_n]$.

(e) $A \sim B$ if and only if $A$ and $B$ represent $T$ in the corresponding bases of $M$ and $N$ respectively

(1.10.12) $[u_1, \ldots, u_m], [v_1, \ldots, v_n]$ and $[\tilde u_1, \ldots, \tilde u_m], [\tilde v_1, \ldots, \tilde v_n]$.

Sketch of a proof. Let $A$ be the representation matrix of $T$ in the bases $[u_1, \ldots, u_m]$ and $[v_1, \ldots, v_n]$ given in (1.10.2). Assume that the relation between the bases $[u_1, \ldots, u_m]$ and $[\tilde u_1, \ldots, \tilde u_m]$ is given by (1.10.5), so that (1.10.6) holds with $R = Q^{-1}$. Then

$T v_j = \sum_{i=1}^{m} a_{ij} u_i = \sum_{i=k=1}^{m} r_{ki} a_{ij} \tilde u_k, \quad j = 1, \ldots, n.$

Hence the representation matrix $B$ in bases $[\tilde u_1, \ldots, \tilde u_m]$ and $[v_1, \ldots, v_n]$ is $B = RA$ with $R \in$ GL$(m, \mathbb{D})$, i.e. it is given by (1.10.8).

Change the basis $[v_1, \ldots, v_n]$ to $[\tilde v_1, \ldots, \tilde v_n]$ according to

$\tilde v_j = \sum_{l=1}^{n} p_{lj} v_l, \quad j = 1, \ldots, n, \quad P = [p_{lj}] \in$ GL$(n, \mathbb{D})$.

Then a similar computation shows that $T$ is presented in the bases $[u_1, \ldots, u_m]$ and $[\tilde v_1, \ldots, \tilde v_n]$ by $AP$. Combine the above results to deduce that the representation matrix $B$ of $T$ in bases $[\tilde u_1, \ldots, \tilde u_m]$ and $[\tilde v_1, \ldots, \tilde v_n]$ is given by (1.10.9). □



Problems 

1. Let $A \in \mathbb{D}_B^{m \times n}$. View $A$ as a linear transformation $A : \mathbb{D}_B^n \to \mathbb{D}_B^m$. Show that Range $A$ is a module with basis of dimension rank $A$. (Hint: Use Problem 1.6.3.)

2. For $A, B \in \mathbb{D}^{m \times n}$ show

(a) If $A \sim_l B$ then Ker $A =$ Ker $B$ and Range $A$ and Range $B$ are isomorphic.

(b) If $A \sim_r B$ then Range $A =$ Range $B$ and Ker $A$ and Ker $B$ are isomorphic.



1.11 Hermite normal form 

We start this section with two motivating problems. 

Problem 1.11.1 Given $A, B \in \mathbb{D}^{m \times n}$. When are $A$ and $B$
(l) left equivalent;
(r) right equivalent;
(e) equivalent.

Problem 1.11.2 For a given $A \in \mathbb{D}^{m \times n}$ characterize the equivalence classes corresponding to the left equivalence, to the right equivalence and to the equivalence relation as defined in Problem 1.11.1.

For $\mathbb{D}_G$ the equivalence relation has the following natural invariants:



Lemma 1.11.3 For $A \in \mathbb{D}_G^{m \times n}$ let

(1.11.1)
$\mu(\alpha, A) :=$ g.c.d. $(\{\det A[\alpha, \theta], \ \theta \in [n]_k\})$, $\ \alpha \in [m]_k$,
$\nu(\beta, A) :=$ g.c.d. $(\{\det A[\phi, \beta], \ \phi \in [m]_k\})$, $\ \beta \in [n]_k$,
$\delta_k(A) :=$ g.c.d. $(\{\det A[\phi, \theta], \ \phi \in [m]_k, \ \theta \in [n]_k\})$.

($\delta_k(A)$ is called the $k$-th determinant invariant of $A$.) Then

(1.11.2)
$\mu(\alpha, A) \equiv \mu(\alpha, B)$ for all $\alpha \in [m]_k$ if $A \sim_r B$,
$\nu(\beta, A) \equiv \nu(\beta, B)$ for all $\beta \in [n]_k$ if $A \sim_l B$,
$\delta_k(A) \equiv \delta_k(B)$ if $A \sim B$,

for $k = 1, \ldots, \min(m, n)$. (Recall that for $a, b \in \mathbb{D}$, $a \equiv b$ if $a = bc$ for some invertible $c \in \mathbb{D}$.)

Proof. Suppose that (1.10.7) holds. Then the Cauchy-Binet formula (e.g. [Gan59]) implies

$\det B[\alpha, \gamma] = \sum_{\theta \in [n]_k} \det A[\alpha, \theta] \, \det P[\theta, \gamma].$

Hence $\mu(\alpha, A)$ divides $\mu(\alpha, B)$. As $A = BP^{-1}$ we get $\mu(\alpha, B) \mid \mu(\alpha, A)$. Thus $\mu(\alpha, A) \equiv \mu(\alpha, B)$. The other equalities in (1.11.2) are established in a similar way. □
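
The invariance of $\delta_k$ under equivalence is easy to observe numerically. The following is a minimal sketch, assuming $\mathbb{D} = \mathbb{Z}$ with the g.c.d. normalized to be nonnegative; the names `det`, `delta`, `rank` and `matmul` are illustrative. All $k \times k$ minors are enumerated, so this is meant for small matrices only.

```python
from itertools import combinations
from math import gcd


def det(matrix):
    """Integer determinant by Laplace expansion along the first row."""
    if not matrix:
        return 1
    total = 0
    for col, entry in enumerate(matrix[0]):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * entry * det(minor)
    return total


def delta(matrix, k):
    """delta_k(A): the g.c.d. of all k x k minors of the integer matrix A."""
    m, n = len(matrix), len(matrix[0])
    value = 0
    for rows in combinations(range(m), k):
        for cols in combinations(range(n), k):
            sub = [[matrix[r][c] for c in cols] for r in rows]
            value = gcd(value, det(sub))
    return value


def rank(matrix):
    """Maximal size of a nonvanishing minor (0 for the zero matrix)."""
    m, n = len(matrix), len(matrix[0])
    return max((k for k in range(1, min(m, n) + 1) if delta(matrix, k) != 0), default=0)


def matmul(x, y):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


A = [[2, 4], [6, 8]]
Q = [[1, 1], [0, 1]]   # det Q = 1
P = [[1, 0], [2, 1]]   # det P = 1
B = matmul(matmul(Q, A), P)
```

Here $B = QAP$ has entirely different entries from $A$, yet `delta(A, 1) == delta(B, 1) == 2` and `delta(A, 2) == delta(B, 2) == 8`, as (1.11.2) predicts.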

Clearly

(1.11.3) $A \sim_l B \iff A^T \sim_r B^T, \quad A, B \in \mathbb{D}^{m \times n}.$

Hence it is enough to consider the left equivalence relation. We characterize the left equivalence classes for Bezout domains $\mathbb{D}_B$. To do that we need a few notations.

Recall that $P \in \mathbb{D}^{n \times n}$ is called a permutation matrix if $P$ is a matrix having at each row and each column one nonzero element which is equal to the identity element 1. A permutation matrix is invertible since $P^{-1} = P^T$.

Definition 1.11.4 Let $\Pi_n \subset$ GL$(n, \mathbb{D})$ be the group of $n \times n$ permutation matrices.

Definition 1.11.5 An invertible matrix $U \in$ GL$(n, \mathbb{D})$ is called simple if there exist $P, Q \in \Pi_n$ such that

(1.11.4) $U = P \begin{bmatrix} V & 0 \\ 0 & I_{n-2} \end{bmatrix} Q,$

where

(1.11.5) $V = \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix} \in$ GL$(2, \mathbb{D})$, ($\alpha\delta - \beta\gamma$ is invertible).

$U$ is called elementary if $U$ is of the form (1.11.4) and

(1.11.6) $V = \begin{bmatrix} \alpha & \beta \\ 0 & \delta \end{bmatrix} \in$ GL$(2, \mathbb{D})$, and $\alpha, \delta$ are invertible.



Definition 1.11.6 Let $A \in \mathbb{D}^{m \times n}$. The following row (column) operations are called elementary:

(a) interchange any two rows (columns) of $A$;

(b) multiply row (column) $i$ by an invertible element $a$;

(c) add to row (column) $j$ $b$ times row (column) $i$ ($i \ne j$).

The following row (column) operation is called simple:

(d) replace row (column) $i$ by $a$ times row (column) $i$ plus $b$ times row (column) $j$, and row (column) $j$ by $c$ times row (column) $i$ plus $d$ times row (column) $j$, where $i \ne j$ and $ad - bc$ is invertible in $\mathbb{D}$.

It is straightforward to see that the elementary row (column) operations can be carried out by multiplication of $A$ by a suitable elementary matrix from the left (right), and the simple row (column) operations are carried out by multiplication of $A$ by a simple matrix $U$ from the left (right).

Theorem 1.11.7 Let $\mathbb{D}_B$ be a Bezout domain. Let $A \in \mathbb{D}_B^{m \times n}$. Assume that rank $A = r$. Then there exists $B = [b_{ij}] \in \mathbb{D}_B^{m \times n}$ which is left equivalent to $A$ and satisfies the following conditions:

(1.11.7) the $i$-th row of $B$ is a nonzero row if and only if $i \le r$.

Let $b_{i n_i}$ be the first nonzero entry in the $i$-th row for $i = 1, \ldots, r$. Then

(1.11.8) $1 \le n_1 < n_2 < \cdots < n_r \le n.$

The numbers $n_1, \ldots, n_r$ are uniquely determined and the elements $b_{i n_i}$, $i = 1, \ldots, r$, which are called pivots, are uniquely determined, up to invertible factors, by the conditions

(1.11.9)
$\nu((n_1, \ldots, n_i), A) \equiv b_{1 n_1} \cdots b_{i n_i}, \quad i = 1, \ldots, r,$
$\nu(\alpha, A) = 0, \quad \alpha \in [n_i - 1]_i, \ i = 1, \ldots, r.$

For $1 \le j < i \le r$, adding to the row $j$ a multiple of the row $i$ does not change the above form of $B$. Assume that $B = [b_{ij}], C = [c_{ij}] \in \mathbb{D}_B^{m \times n}$ are left equivalent to $A$ and satisfy the above conditions. If $b_{j n_i} = c_{j n_i}$, $j = 1, \ldots, i$, $i = 1, \ldots, r$, then $B = C$. The invertible matrix $Q$ which satisfies (1.10.8) can be given by a finite product of simple matrices.

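Proof. Clearly, it is enough to consider the case $A \ne 0$, i.e. $r \ge 1$. Our proof is by induction on $n$ and $m$. For $n = m = 1$ the theorem is obvious. Let $n = 1$ and assume that for a given $m \ge 1$ there exists a matrix $Q$, which is a finite product of simple matrices, such that the entries $(i, 1)$ of $QA$ are zero for $i = 2, \ldots, m$ if $m \ge 2$. Let $A_1 = [a_{i1}] \in \mathbb{D}_B^{(m+1) \times 1}$ and denote by $A$ the submatrix $[a_{i1}]_{i=1}^{m}$. Set

$Q_1 := \begin{bmatrix} Q & 0 \\ 0 & 1 \end{bmatrix}.$

Then the $(i, 1)$ entries of $A_2 = [a^{(2)}_{i1}] = Q_1 A_1$ are equal to zero for $i = 2, \ldots, m$. Interchange the second and the last row of $A_2$ to obtain $A_3$. Clearly $A_3 = [a^{(3)}_{i1}] = Q_2 A_2$ for some permutation matrix $Q_2$. Let $A_4 = (a^{(3)}_{11}, a^{(3)}_{21})^T$. As $\mathbb{D}_B$ is a Bezout domain there exist $\alpha, \beta \in \mathbb{D}_B$ such that

(1.11.10) $\alpha a^{(3)}_{11} + \beta a^{(3)}_{21} = (a^{(3)}_{11}, a^{(3)}_{21}) = d.$

As $(\alpha, \beta) = 1$ there exist $\gamma, \delta \in \mathbb{D}_B$ such that

(1.11.11) $\alpha\delta - \beta\gamma = 1.$

Let $V$ be a $2 \times 2$ invertible matrix given by (1.11.5). Then

$A_5 = V A_4 = \begin{bmatrix} d \\ d' \end{bmatrix}.$

Lemma 1.11.3 implies $\nu((1), A_5) \equiv \nu((1), A_4) = d$. Hence $d' = pd$ for some $p \in \mathbb{D}_B$. Thus

$\begin{bmatrix} d \\ 0 \end{bmatrix} = W A_5, \quad W = \begin{bmatrix} 1 & 0 \\ -p & 1 \end{bmatrix} \in$ GL$(2, \mathbb{D}_B)$.

Let

$Q_3 = \begin{bmatrix} W & 0 \\ 0 & I_{m-1} \end{bmatrix} \begin{bmatrix} V & 0 \\ 0 & I_{m-1} \end{bmatrix}.$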


Then the last $m$ rows of $A_6 = [a^{(6)}_{i1}] = Q_3 A_3$ are zero rows. So $\nu((1), A_6) \equiv \nu((1), A_1)$ and the theorem is proved in this case.

Assume now that we proved the theorem for all $A_1 \in \mathbb{D}_B^{m \times n}$ where $n \le p$. Let $n = p + 1$ and $A \in \mathbb{D}_B^{m \times (p+1)}$. Let $A_1 = [a_{ij}]_{i=j=1}^{m,p}$. The induction hypothesis implies the existence of $Q_1 \in$ GL$(m, \mathbb{D}_B)$, which is a finite product of simple matrices, such that $B_1' = [b^{(1)}_{ij}]_{i=j=1}^{m,p} = Q_1 A_1$ satisfies the assumptions of our theorem. Let $n_1', \ldots, n_s'$ be the integers defined by $A_1$. Let $B_1 = [b^{(1)}_{ij}]_{i=j=1}^{m,n} = Q_1 A$. If $b^{(1)}_{in} = 0$ for $i > s$ then $n_i = n_i'$, $i = 1, \ldots, s$ and $B_1$ is in the right form. Suppose now that $b^{(1)}_{in} \ne 0$ for some $s < i \le m$. Let $B_2 = [b^{(1)}_{in}]_{i=s+1}^{m} \in \mathbb{D}_B^{(m-s) \times 1}$. We proved above that there exists $Q_2 \in$ GL$(m - s, \mathbb{D}_B)$ such that $Q_2 B_2 = (c, 0, \ldots, 0)^T$, $c \ne 0$. Then

$B_3 = \begin{bmatrix} I_s & 0 \\ 0 & Q_2 \end{bmatrix} B_1$

is in the right form with

$s = r - 1, \quad n_1 = n_1', \ \ldots, \ n_{r-1} = n_{r-1}', \quad n_r = n.$

We now show (1.11.9). First if $\alpha \in [n_i - 1]_i$ then any matrix $B[\beta, \alpha]$, $\beta \in [m]_i$, has at least one zero row. Hence $\det B[\beta, \alpha] = 0$. Therefore $\nu(\alpha, B) = 0$. Lemma 1.11.3 yields that $\nu(\alpha, A) = 0$. Let $\alpha = (n_1, \ldots, n_i)$. Then $B[\beta, \alpha]$, $\beta \in [m]_i$, has at least one zero row unless $\beta$ is equal to $\gamma = (1, 2, \ldots, i)$. Therefore

$\nu(\alpha, A) \equiv \nu(\alpha, B) = \det B[\gamma, \alpha] = b_{1 n_1} \cdots b_{i n_i} \ne 0.$

This establishes (1.11.9).

It is obvious that $b_{1 n_1}, \ldots, b_{r n_r}$ are determined up to invertible elements. For $1 \le j < i \le r$ we can perform the following elementary row operation on $B$: add to row $j$ a multiple of row $i$. The new matrix $C$ will satisfy the assumption of the theorem. It is left to show that if $B = [b_{ij}], C = [c_{ij}] \in \mathbb{D}_B^{m \times n}$ are left equivalent to $A$, have the same form given by the theorem and satisfy

(1.11.12) $b_{j n_i} = c_{j n_i}, \quad j = 1, \ldots, i, \ i = 1, \ldots, r,$

then $B = C$. See Problem 1. □
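
The key step of the proof, (1.11.10) and (1.11.11), can be made concrete. The following is a minimal sketch, assuming $\mathbb{D}_B = \mathbb{Z}$ and using the extended Euclidean algorithm to produce $\alpha, \beta$; the names `extended_gcd` and `simple_reduction` are illustrative. Instead of forming $V$ and $W$ separately, the sketch picks $\gamma = -b/d$, $\delta = a/d$, so that the second entry is already zero; this is one admissible choice, not the only one.

```python
def extended_gcd(a, b):
    """Return (d, alpha, beta) with alpha*a + beta*b = d = gcd(a, b) >= 0."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    if old_r < 0:
        old_r, old_s, old_t = -old_r, -old_s, -old_t
    return old_r, old_s, old_t


def simple_reduction(a, b):
    """2 x 2 integer matrix U with det U = 1 and U (a, b)^T = (gcd(a, b), 0)^T."""
    d, alpha, beta = extended_gcd(a, b)
    if d == 0:
        return [[1, 0], [0, 1]]  # a = b = 0: nothing to reduce
    # alpha*delta - beta*gamma = (alpha*a + beta*b) / d = 1, as in (1.11.11).
    gamma, delta = -b // d, a // d
    return [[alpha, beta], [gamma, delta]]


U = simple_reduction(12, 18)
image = [U[0][0] * 12 + U[0][1] * 18, U[1][0] * 12 + U[1][1] * 18]
```

For the pair $(12, 18)$ the sketch returns $U = \begin{bmatrix} -1 & 1 \\ -3 & 2 \end{bmatrix}$ and `image` is $(6, 0)$. Embedding such a $U$ into an identity matrix as in (1.11.4) gives the simple matrix acting on two chosen rows.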



A matrix $B \in \mathbb{D}_B^{m \times n}$ is said to be in a Hermite normal form, abbreviated as HNF, if it satisfies conditions (1.11.7)-(1.11.8).

Normalization 1.11.8 Let $B = [b_{ij}] \in \mathbb{D}_B^{m \times n}$ be in a Hermite normal form. If $b_{i n_i}$ is invertible we set $b_{i n_i} = 1$ and $b_{j n_i} = 0$ for $j < i$.

Theorem 1.11.9 Let $U$ be an invertible matrix over a Bezout domain. Then $U$ is a finite product of simple matrices.

Proof. Since $\det U$ is invertible, Theorem 1.11.7 yields that each pivot $b_{ii}$ is invertible. Normalization 1.11.8 implies that the Hermite normal form of $U$ is $I$, i.e. $QU = I$ where $Q$ is a finite product of simple matrices. Hence the inverse of $U$ is a finite product of simple matrices. As the inverse of a simple matrix is simple, $U$ itself is a finite product of simple matrices. □



Normalization 1.11.10 For Euclidean domains assume

(1.11.13) either $b_{j n_i} = 0$ or $d(b_{j n_i}) < d(b_{i n_i})$ for $j < i$.

For $\mathbb{Z}$ we assume that $b_{i n_i} \ge 1$ and $0 \le b_{j n_i} < b_{i n_i}$ for $j < i$. For $\mathbb{F}[x]$ we assume that $b_{i n_i}$ is a normalized polynomial.

Corollary 1.11.11 Let $\mathbb{D}_E = \mathbb{Z}, \mathbb{F}[x]$. Under Normalization 1.11.10 any $A \in \mathbb{D}_E^{m \times n}$ has a unique Hermite normal form.

It is a well known fact that over Euclidean domains Hermite normal form can be achieved by performing elementary row operations.

Theorem 1.11.12 Let $A \in \mathbb{D}_E^{m \times n}$. Then $B = QA$, $Q \in$ GL$(m, \mathbb{D}_E)$, where $B$ is in a Hermite normal form satisfying Normalization 1.11.10 and $Q$ is a finite product of elementary matrices.

Proof. From the proof of Theorem 1.11.7 it follows that it is enough to show that any $A \in$ GL$(2, \mathbb{D}_E)$ is a finite product of elementary invertible matrices in GL$(2, \mathbb{D}_E)$. As $I_2$ is the Hermite normal form of any $2 \times 2$ invertible matrix, it suffices to show that any $A \in \mathbb{D}_E^{2 \times 2}$ can be brought to its Hermite form by a finite number of elementary row operations. Let

$A_i = \begin{bmatrix} a_i & b_i \\ a_{i+1} & b_{i+1} \end{bmatrix}, \quad A_1 = PA,$

where $P$ is a permutation matrix such that $d(a_1) \ge d(a_2)$. Suppose first that $a_2 \ne 0$. Compute $a_{i+2}$ by (1.3.4), i.e. by the division step $a_i = t_i a_{i+1} + a_{i+2}$. Then

$A_{i+1} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1 & -t_i \\ 0 & 1 \end{bmatrix} A_i, \quad i = 1, \ldots$

As the Euclidean algorithm terminates after a finite number of steps we obtain that $a_{k+1} = 0$. Then $A_k$ is the Hermite normal form of $A$. If $b_{k+1} = 0$ we are done. If $b_{k+1} \ne 0$ subtract from the first row of $A_k$ a corresponding multiple of the second row of $A_k$ to obtain the matrix

$B' = \begin{bmatrix} a_k & b_k' \\ 0 & b_{k+1} \end{bmatrix}, \quad d(b_{k+1}) > d(b_k').$

Multiply each row of $B'$ by an invertible element if necessary to obtain Hermite's form of $B$ according to Normalization 1.11.10. We obtained $B$ by a finite number of elementary row operations. If $a_1 = a_2 = 0$ perform the Euclid algorithm on the second column of $A$. □
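
The procedure behind Theorem 1.11.12 can be sketched for $\mathbb{D}_E = \mathbb{Z}$. The following is a minimal sketch, not an optimized algorithm: it runs the Euclidean algorithm column by column using only the elementary row operations of Definition 1.11.6 and then applies Normalization 1.11.10 for $\mathbb{Z}$. The function name `hermite_normal_form` is illustrative.

```python
def hermite_normal_form(matrix):
    """Hermite normal form of an integer matrix, using elementary row operations only."""
    rows = [list(row) for row in matrix]
    m, n = len(rows), len(rows[0])
    pivot_row = 0
    for col in range(n):
        if pivot_row == m:
            break
        # Euclidean algorithm on column `col`, restricted to rows pivot_row..m-1.
        while True:
            nonzero = [r for r in range(pivot_row, m) if rows[r][col] != 0]
            if len(nonzero) <= 1:
                break
            # The row with the entry of smallest absolute value acts as the divisor.
            small = min(nonzero, key=lambda r: abs(rows[r][col]))
            for r in nonzero:
                if r != small:
                    q = rows[r][col] // rows[small][col]
                    rows[r] = [x - q * y for x, y in zip(rows[r], rows[small])]
        nonzero = [r for r in range(pivot_row, m) if rows[r][col] != 0]
        if not nonzero:
            continue  # no pivot in this column
        r = nonzero[0]
        rows[pivot_row], rows[r] = rows[r], rows[pivot_row]  # interchange two rows
        if rows[pivot_row][col] < 0:
            rows[pivot_row] = [-x for x in rows[pivot_row]]  # multiply by the unit -1
        pivot = rows[pivot_row][col]
        # Normalization 1.11.10 for Z: entries above the pivot lie in [0, pivot).
        for above in range(pivot_row):
            q = rows[above][col] // pivot
            rows[above] = [x - q * y for x, y in zip(rows[above], rows[pivot_row])]
        pivot_row += 1
    return rows


H = hermite_normal_form([[4, 6], [6, 10]])
```

For this input `H` is `[[2, 0], [0, 2]]`. The loop terminates because each division step strictly decreases the absolute values in the active column, which is the termination argument of the Euclidean algorithm used in the proof.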



Corollary 1.11.13 Let $U \in$ GL$(n, \mathbb{D}_E)$. Then $U$ is a finite product of elementary invertible matrices.

Corollary 1.11.14 Let $\mathbb{F}$ be a field. Then $A \in \mathbb{F}^{m \times n}$ can be brought to its unique reduced row echelon form given by Theorem 1.11.7 with

$b_{i n_i} = 1, \quad b_{j n_i} = 0, \ j = 1, \ldots, i - 1, \quad i = 1, \ldots, r,$

by a finite number of elementary row operations.



Problems 

1. Show

(a) Let $A, B \in \mathbb{D}^{m \times m}$ be two upper triangular matrices with the same nonzero diagonal entries. Assume that $QA = B$ for some $Q \in \mathbb{D}^{m \times m}$. Then $Q$ is an upper triangular matrix with one on the main diagonal. (Hint: First prove this claim for the quotient field $\mathbb{F}$ of $\mathbb{D}$.)

(b) Let $Q \in \mathbb{D}^{m \times m}$ be an upper triangular matrix with 1 on the main diagonal. Show that $Q = R_2 \cdots R_m = T_m \cdots T_2$ where $R_i - I_m$, $T_i - I_m$ may have nonzero entries only in the places $(j, i)$ for $j = 1, \ldots, i - 1$ and $i = 2, \ldots, m$.

(c) Let $A, B \in \mathbb{D}^{m \times n}$. Assume that $A \sim_l B$, and $A$ and $B$ are in HNF and have the same pivots. Then $B$ can be obtained from $A$ by adding multiples of the row $i$ (the row of the pivot $b_{i n_i}$) to the rows $j = 1, \ldots, i - 1$ for $i = 2, \ldots, r$.

(d) Let $B = [b_{ij}], C = [c_{ij}] \in \mathbb{D}^{m \times n}$. Assume that $B, C$ are in Hermite's normal form with the same $r$ numbers $1 \le n_1 < \cdots < n_r \le n$. Suppose furthermore (1.11.12) holds and $B = QC$ for some $Q \in \mathbb{D}^{m \times m}$. Then

$Q = \begin{bmatrix} I_r & * \\ 0 & * \end{bmatrix}, \quad B = C.$

(Here $*$ denotes a matrix of a corresponding size.)

(e) Let $M$ be a $\mathbb{D}_B$ module, $N = \mathbb{D}_B^n$ and $T \in$ Hom$(N, M)$. Let Range$(T)$ be the range of $T$ in $M$. Then the module Range$(T)$ has a basis $Tu_1, \ldots, Tu_k$ such that

(1.11.14) $u_i = \sum_{j=1}^{i} c_{ij} v_j, \quad c_{ii} \ne 0, \ i = 1, \ldots, k,$

where $v_1, \ldots, v_n$ is a permutation of the standard basis

(1.11.15) $e_i = (\delta_{i1}, \ldots, \delta_{in})^T, \quad i = 1, \ldots, n.$

2. Let $A \in \mathbb{D}_B^{m \times n}$ and assume that $B$ is its Hermite's normal form. Assume that $n_i < j < n_{i+1}$. Prove that

$\nu(\alpha, A) \equiv b_{1 n_1} \cdots b_{(i-1) n_{i-1}} b_{ij}$, for $\alpha = (n_1, \ldots, n_{i-1}, j)$.

3. Definition 1.11.15 Let $\mathbb{F}$ be a field and $V$ a vector space over $\mathbb{F}$ of dimension $n$. A flag $F_\bullet$ on $V$ is a strictly increasing sequence of subspaces

(1.11.16) $0 = F_0 \subset F_1 \subset \cdots \subset F_n = V, \quad \dim F_i = i, \ i = 1, \ldots, n = \dim V.$

Show

(a) Let $L$ be a subspace of $V$ of dimension $\ell$. Then

(1.11.17) $\dim L \cap F_{i-1} \le \dim L \cap F_i \le \dim L \cap F_{i-1} + 1, \quad i = 1, \ldots, n.$

(b) Let Gr$(\ell, V)$ be the space of all $\ell$-dimensional subspaces of $V$. Let $J = \{1 \le j_1 < \cdots < j_\ell \le n\}$ be a subset of $[n]$ of cardinality $\ell = |J|$. Then

(1.11.18)
$\Omega^{\circ}(J, F_\bullet) := \{L \in$ Gr$(\ell, V) : \ \dim L \cap F_{j_i} = i, \ i = 1, \ldots, \ell\}$,
$\Omega(J, F_\bullet) := \{L \in$ Gr$(\ell, V) : \ \dim L \cap F_{j_i} \ge i, \ i = 1, \ldots, \ell\}$,

which are called the open and the closed Schubert cell in Gr$(\ell, V)$ respectively. Show that a given $L \in$ Gr$(\ell, V)$ belongs to the smallest open Schubert cell $\Omega^{\circ}(J, F_\bullet)$, where $J = J(L, F_\bullet)$ is given by the condition

(1.11.19) $\dim L \cap F_{j_i} = i, \quad \dim L \cap F_{j_i - 1} = i - 1, \quad i = 1, \ldots, \ell.$

(c) Let $V = \mathbb{F}^n$ and assume that $e_1, \ldots, e_n$ is the standard basis of $\mathbb{F}^n$. Let

(1.11.20) $F_i =$ span $(e_n, e_{n-1}, \ldots, e_{n-i+1}), \quad i = 1, \ldots, n$

be the reversed standard flag in $\mathbb{F}^n$. Let $A \in \mathbb{F}^{m \times n}$. Assume that $\ell =$ rank $A \ge 1$. Let $L \in$ Gr$(\ell, \mathbb{F}^n)$ be the vector space spanned by the columns of $A^T$. Let $N = \{1 \le n_1 < \cdots < n_\ell \le n\}$ be the integers given by the row echelon form of $A$. Then $J(L, F_\bullet) = N$.

1.12 Systems of linear equations over Bezout 
domains 

Consider a system of $m$ linear equations in $n$ unknowns:

(1.12.1) $\sum_{j=1}^{n} a_{ij} x_j = b_i, \quad i = 1, \ldots, m, \qquad a_{ij}, b_i \in \mathbb{D}, \ i = 1, \ldots, m, \ j = 1, \ldots, n.$

In matrix notation (1.12.1) is equivalent to

(1.12.2) $Ax = b, \quad A \in \mathbb{D}^{m \times n}, \ x \in \mathbb{D}^n, \ b \in \mathbb{D}^m.$

Let

(1.12.3) $\hat{A} = [A, b] \in \mathbb{D}^{m \times (n+1)}.$

The matrix $A$ is called the coefficient matrix and the matrix $\hat{A}$ is called the augmented coefficient matrix. If $\mathbb{D}$ is a field, the classical Kronecker-Capelli theorem states [Gan59] that (1.12.1) is solvable if and only if

(1.12.4) rank $A =$ rank $\hat{A}$.

Let $\mathbb{F}$ be the quotient field of $\mathbb{D}$. If (1.12.1) is solvable over $\mathbb{D}$ it is also solvable over $\mathbb{F}$. Therefore (1.12.4) is a necessary condition for the solvability of (1.12.1) over $\mathbb{D}$. Clearly, even in the case $m = n = 1$ this condition is not sufficient. In this section we give necessary and sufficient conditions on $\hat{A}$ for the solvability of (1.12.1) over a Bezout domain. First we need the following lemma:

Lemma 1.12.1 Let $0 \ne A \in \mathbb{D}_B^{m \times n}$. Then there exist $P \in \Pi_m$, $U \in$ GL$(n, \mathbb{D}_B)$ such that

(1.12.5)
$C = [c_{ij}] = PAU,$
$c_{ii} \ne 0, \ i = 1, \ldots,$ rank $A$,
$c_{ij} = 0$ if either $j > i$ or $j >$ rank $A$.

Proof. Consider the matrix $A^T$. By interchanging the columns of $A^T$, i.e. multiplying $A^T$ from the right by some permutation matrix $P^T$, we can assume that the Hermite normal form of $A^T P^T$ satisfies $n_i = i$, $i = 1, \ldots,$ rank $A$. □



Theorem 1.12.2 Let $\mathbb{D}$ be a Bezout domain. Then the system (1.12.1) is solvable if and only if

(1.12.6) $r =$ rank $A =$ rank $\hat{A}, \quad \delta_r(A) \equiv \delta_r(\hat{A}).$

Proof. Assume first the existence of $x \in \mathbb{D}^n$ which satisfies (1.12.2). Hence (1.12.4) holds, i.e. the first part of (1.12.6) holds. As any $r \times r$ minor of $A$ is a minor of $\hat{A}$ we deduce that $\delta_r(\hat{A}) \mid \delta_r(A)$. (1.12.2) implies that $b$ is a linear combination of the columns of $A$. Consider any $r \times r$ minor of $\hat{A}$ which contains the $(n+1)$-st column $b$. Since $b$ is a linear combination of columns of $A$ it follows that $\delta_r(A)$ divides this minor. Hence $\delta_r(A) \mid \delta_r(\hat{A})$, which establishes the second part of (1.12.6). (Actually we showed that if (1.12.1) is solvable over $\mathbb{D}_G$ then (1.12.6) holds.)

Assume now that (1.12.6) holds. Let

$V\hat{A} = \hat{B} = [B, \hat{b}] \in \mathbb{D}^{m \times (n+1)}, \quad V \in$ GL$(m, \mathbb{D})$

be Hermite's normal form of $\hat{A}$. Hence $B$ is Hermite's normal form of $A$. Furthermore

$VA = B$, rank $B =$ rank $A =$ rank $\hat{A} =$ rank $\hat{B} = r$,
$\delta_r(B) \equiv \delta_r(A) \equiv \delta_r(\hat{A}) \equiv \delta_r(\hat{B}).$

Hence $n_r$ in Hermite's normal form of $\hat{A}$ is at most $n$. Note that the last $m - r$ equations of $Bx = \hat{b}$ are the trivial equations $0 = 0$. That is, it is enough to show the solvability of the system (1.12.2) under the assumptions (1.12.6) with $r = m$. By changing the order of equations in (1.12.1) and introducing a new set of variables

(1.12.7) $y = U^{-1}x, \quad U \in$ GL$(n, \mathbb{D})$,

we may assume that the system (1.12.2) is

(1.12.8) $Cy = d, \quad C = PAU, \quad d = (d_1, \ldots, d_m)^T = Pb,$

where $C$ is given as in Lemma 1.12.1 with $r = m$. Let $\hat{C} = [C, d]$. It is straightforward to see that $A \sim C$, $\hat{A} \sim \hat{C}$. Hence

rank $C =$ rank $A =$ rank $\hat{A} =$ rank $\hat{C} = m$, $\quad \delta_m(C) \equiv \delta_m(A) \equiv \delta_m(\hat{A}) \equiv \delta_m(\hat{C}).$

Thus it is enough to show that the system (1.12.8) is solvable. In view of the form of $C$ the solvability of the system (1.12.8) over $\mathbb{D}$ is equivalent to the solvability of the system

(1.12.9) $\tilde{C}\tilde{y} = d, \quad \tilde{C} = [c_{ij}]_{i=j=1}^{m} \in \mathbb{D}^{m \times m}, \quad \tilde{y} = (y_1, \ldots, y_m)^T.$

Note that $\delta_m(C) \equiv \delta_m(\tilde{C}) \equiv \det \tilde{C}$. Cramer's rule for the above system in the quotient field $\mathbb{F}$ of $\mathbb{D}$ yields

$y_i = \frac{\det \tilde{C}_i}{\det \tilde{C}}, \quad i = 1, \ldots, m.$

Here $\tilde{C}_i$ is obtained by replacing column $i$ of $\tilde{C}$ by $d$. Clearly $\det \tilde{C}_i$ is an $m \times m$ minor of $\hat{C}$ up to the factor $\pm 1$. Hence it is divisible by $\delta_m(\hat{C}) \equiv \delta_m(C) \equiv \det \tilde{C}$. Therefore $y_i \in \mathbb{D}$, $i = 1, \ldots, m$. □
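
Condition (1.12.6) gives a direct solvability test. The following is a minimal sketch, assuming $\mathbb{D} = \mathbb{Z}$ with nonnegative g.c.d.; the names `det`, `delta`, `rank` and `solvable_over_integers` are illustrative. The test only decides solvability; it does not produce a solution.

```python
from itertools import combinations
from math import gcd


def det(matrix):
    """Integer determinant by Laplace expansion along the first row."""
    if not matrix:
        return 1
    total = 0
    for col, entry in enumerate(matrix[0]):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * entry * det(minor)
    return total


def delta(matrix, k):
    """delta_k: the g.c.d. of all k x k minors."""
    m, n = len(matrix), len(matrix[0])
    value = 0
    for rows in combinations(range(m), k):
        for cols in combinations(range(n), k):
            value = gcd(value, det([[matrix[r][c] for c in cols] for r in rows]))
    return value


def rank(matrix):
    """Maximal size of a nonvanishing minor."""
    m, n = len(matrix), len(matrix[0])
    return max((k for k in range(1, min(m, n) + 1) if delta(matrix, k) != 0), default=0)


def solvable_over_integers(a, b):
    """Test (1.12.6) for A x = b over Z: equal ranks and equal r-th determinant invariants."""
    augmented = [list(row) + [rhs] for row, rhs in zip(a, b)]
    r = rank(a)
    if r != rank(augmented):
        return False
    return r == 0 or delta(a, r) == delta(augmented, r)


case_unsolvable = solvable_over_integers([[2]], [1])     # 2x = 1
case_solvable = solvable_over_integers([[2, 3]], [1])    # 2x + 3y = 1
```

The equation $2x = 1$ satisfies the rank condition (1.12.4) but fails the second part of (1.12.6), since $\delta_1(A) = 2$ while $\delta_1(\hat{A}) = 1$; this is the $m = n = 1$ phenomenon mentioned at the start of the section. The equation $2x + 3y = 1$ passes both parts.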



Theorem 1.12.3 Let $A \in \mathbb{D}_B^{m \times n}$. Then Range $A$ and Ker $A$ are modules in $\mathbb{D}_B^m$ and $\mathbb{D}_B^n$ having finite bases with rank $A$ and nul $A$ elements respectively. Moreover, the basis of Ker $A$ can be completed to a basis of $\mathbb{D}_B^n$.

Proof. As in the proof of Theorem 1.12.2 we may assume that rank $A = m$ and $A = C$, where $C$ is given by (1.12.5) with $r = m$. Let $e_1, \ldots, e_n$ be the standard basis of $\mathbb{D}_B^n$. Then $Ce_1, \ldots, Ce_m$ is a basis in Range $C$ and $e_{m+1}, \ldots, e_n$ is a basis for Ker $A$. □

Let $A \in \mathbb{D}_G^{m \times n}$. Expand any $q \times q$ minor of $A$ by any $q - p$ rows, where $1 \le p < q$. We then deduce

(1.12.10) $\delta_p(A) \mid \delta_q(A)$ for any $1 \le p \le q \le \min(m, n)$.

Definition 1.12.4 For $A \in \mathbb{D}_G^{m \times n}$ let

$i_j(A) := \frac{\delta_j(A)}{\delta_{j-1}(A)}, \quad j = 1, \ldots,$ rank $A$, $\quad (\delta_0(A) = 1),$
$i_j(A) = 0$ for rank $A < j \le \min(m, n)$,

be the invariant factors of $A$. $i_j(A)$ is called a trivial factor if $i_j(A)$ is invertible in $\mathbb{D}_G$.
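
Definition 1.12.4 can be evaluated directly for small matrices. The following is a minimal sketch, assuming $\mathbb{D}_G = \mathbb{Z}$ with nonnegative g.c.d.; the names `det`, `delta` and `invariant_factors` are illustrative, and the enumeration of all minors is chosen for clarity rather than efficiency.

```python
from itertools import combinations
from math import gcd


def det(matrix):
    """Integer determinant by Laplace expansion along the first row."""
    if not matrix:
        return 1
    total = 0
    for col, entry in enumerate(matrix[0]):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * entry * det(minor)
    return total


def delta(matrix, k):
    """delta_k: the g.c.d. of all k x k minors."""
    m, n = len(matrix), len(matrix[0])
    value = 0
    for rows in combinations(range(m), k):
        for cols in combinations(range(n), k):
            value = gcd(value, det([[matrix[r][c] for c in cols] for r in rows]))
    return value


def invariant_factors(matrix):
    """i_j = delta_j / delta_(j-1) for j <= rank, and 0 for rank < j <= min(m, n)."""
    m, n = len(matrix), len(matrix[0])
    factors, previous = [], 1  # delta_0 = 1
    for j in range(1, min(m, n) + 1):
        current = delta(matrix, j)
        if current == 0:
            factors.append(0)
        else:
            factors.append(current // previous)  # exact by (1.12.10)
            previous = current
    return factors


factors = invariant_factors([[4, 0], [0, 6]])
```

For diag$(4, 6)$ one gets $\delta_1 = 2$ and $\delta_2 = 24$, so `factors` is `[2, 12]`. Note that the first factor divides the second even though $4$ does not divide $6$.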

Suppose that (1.12.1) is solvable over $\mathbb{D}_B$. Using the fact that $b$ is a linear combination of the columns of $A$ and Theorem 1.12.2 we get an equivalent version of Theorem 1.12.2. (See Problem 2.)

Corollary 1.12.5 Let $A \in \mathbb{D}_B^{m \times n}$, $b \in \mathbb{D}_B^m$. Then the system (1.12.1) is solvable over $\mathbb{D}_B$ if and only if

(1.12.11) $r =$ rank $A =$ rank $\hat{A}, \quad i_k(A) \equiv i_k(\hat{A}), \ k = 1, \ldots, r.$

Problems 

1. Let $A \in \mathbb{D}_G^{m \times n}$. Assume that $r =$ rank $A$. Show

(a)

(1.12.12) $\delta_j(A) = \omega_j \, i_1(A) \cdots i_j(A)$, where $\omega_j$ is invertible in $\mathbb{D}_G$ for $j = 1, \ldots, n$.

(b) $i_1(A) \mid i_j(A)$ for $j = 2, \ldots, r$. (Hint: Expand any minor of order $j$ by any row.)

(c) Let $2 \le k$, $2k - 1 \le j \le r$. Then $i_1(A) \cdots i_k(A) \mid i_{j-k+1}(A) \cdots i_j(A)$.

2. Give a complete proof of Corollary 1.12.5.

3. Let $A \in \mathbb{D}_B^{m \times n}$. Assume that all the pivots in HNF of $A^T$ are invertible elements. Show

(a) Any basis of Range $A$ can be completed to a basis in $\mathbb{D}_B^m$.

(b) $i_1(A) = \ldots = i_{\mathrm{rank}\, A}(A) = 1$.

4. Assume that $\mathbb{D} = \mathbb{D}_B$, $M$ is a $\mathbb{D}$-module with a basis, and $M_1, M_2$ are finitely generated submodules of $M$. Show

(a) $M_1 \cap M_2$ has a basis which can be completed to bases in $M_1$ and $M_2$.

(b) $M_i = (M_1 \cap M_2) \oplus N_i$ for $i = 1, 2$, where each $N_i$ has a basis,

$\dim M_i = \dim (M_1 \cap M_2) + \dim N_i, \ i = 1, 2,$

$M_1 + M_2 = (M_1 \cap M_2) \oplus N_1 \oplus N_2.$

In particular, $\dim (M_1 + M_2) = \dim M_1 + \dim M_2 - \dim (M_1 \cap M_2)$.



1.13 Smith normal form 



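A matrix $D = [d_{ij}] \in \mathbb{D}^{m\times n}$ is called a diagonal matrix if $d_{ij} = 0$ for all
$i \ne j$. The entries $d_{11},\ldots,d_{\ell\ell}$, $\ell = \min(m,n)$, are called the diagonal entries
of $D$. $D$ is denoted as $D = \operatorname{diag}(d_{11},\ldots,d_{\ell\ell})$.

Theorem 1.13.1 Let $0 \ne A \in \mathbb{D}^{m\times n}$. Assume that $\mathbb{D}$ is an elementary
divisor domain. Then $A$ is equivalent to a diagonal matrix

(1.13.1) $B = \operatorname{diag}(i_1(A),\ldots,i_r(A),0,\ldots,0)$, $r = \operatorname{rank} A$.

Furthermore

(1.13.2) $i_{j-1}(A) \mid i_j(A)$ for $j = 2,\ldots,\operatorname{rank} A$.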



Proof. Recall that an elementary divisor domain is a Bezout domain.
For $n = 1$ Hermite's normal form of $A$ is a diagonal matrix with $i_1(A) =
\delta_1(A)$. Next we consider the case $m = n = 2$. Let

$A_1 = WA = \begin{bmatrix} a & b \\ 0 & c \end{bmatrix}$, $W \in \mathrm{GL}(2,\mathbb{D})$,

be Hermite's normal form of $A$. As $\mathbb{D} = \mathbb{D}_{ED}$ there exist $p,q,x,y \in \mathbb{D}$
such that

$(px)a + (py)b + (qy)c = (a,b,c) = \delta_1(A)$.

Clearly $(p,q) = (x,y) = 1$. Hence there exist $\bar p, \bar q, \bar x, \bar y$ such that

$p\bar p - q\bar q = x\bar x - y\bar y = 1$.

Let

$V = \begin{bmatrix} p & q \\ \bar q & \bar p \end{bmatrix}$, $U = \begin{bmatrix} x & \bar y \\ y & \bar x \end{bmatrix}$.

Thus

$G = VA_1U = \begin{bmatrix} \delta_1(A) & g_{12} \\ g_{21} & g_{22} \end{bmatrix}$.

Since $\delta_1(G) = \delta_1(A)$ we deduce that $\delta_1(A)$ divides $g_{12}$ and $g_{21}$. Apply
appropriate elementary row and column operations to deduce that $A$ is
equivalent to a diagonal matrix $C = \operatorname{diag}(i_1(A), d_2)$. As $\delta_2(C) = i_1(A)d_2 =
\delta_2(A)$ we see that $C$ is equivalent to the matrix of the form (1.13.1), where
we can assume that $d_2 = i_2(A)$. Since $i_1(A) \mid d_2$ we have that $i_1(A) \mid i_2(A)$.
We now prove the theorem in the case $m \ge 3$, $n = 2$ by induction starting
from $m = 2$. Let $A = [a_{ij}] \in \mathbb{D}^{m\times 2}$ and denote $\bar A = [a_{ij}]_{i=j=1}^{m-1,2}$. Use the
induction hypothesis to assume that $\bar A$ is in the form (1.13.1). Interchange
the second row of $A$ with the last one to obtain $A_1 \in \mathbb{D}^{m\times 2}$. Apply simple
row and column operations on the first two rows and columns of $A_1$ to
obtain $A_2 = [a_{ij}^{(2)}] \in \mathbb{D}^{m\times 2}$, where $a_{11}^{(2)} = i_1(A)$. Use the elementary row
and column operations to obtain $A_3$ of the form

$A_3 = \begin{bmatrix} i_1(A) & 0 \\ 0 & A_4 \end{bmatrix}$, $A_4 \in \mathbb{D}^{(m-1)\times 1}$.

Recall that $i_1(A)$ divides all the entries of $A_4$. Hence $A_4 = i_1(A)B_4$ and
$i_1(A_4) = i_1(A)i_1(B_4)$. Use simple row operations on the rows $2,\ldots,m$ of
$A_3$ to bring $B_4$ to a diagonal form. Thus $A$ is equivalent to the diagonal
matrix

$C = \operatorname{diag}(i_1(A), i_1(A)i_1(B_4)) \in \mathbb{D}^{m\times 2}$. Recall that

$i_1(A) = \delta_1(A) = \delta_1(C)$, $\delta_2(A) = \delta_2(C) = i_1(A)i_2(A) = i_1(A)i_1(A_4)$.

Thus $i_1(A) \mid i_1(A_4)$, so $i_1(A) = i_1(C)$ and $i_2(A) = i_1(A_4)$. Hence $C$ is
equivalent to $B$ of the form (1.13.1) and $i_1(A) \mid i_2(A)$.

By considering $A^T$ we deduce that we proved the theorem in the case
$\min(m,n) \le 2$. We now prove the remaining cases by a double induction
on $m \ge 3$ and $n \ge 3$. Assume that the theorem holds for all matrices
in $\mathbb{D}^{(m-1)\times n}$ for $n = 2,3,\ldots$ Assume that $m \ge 3$ is fixed,
and that the theorem holds for any $E \in \mathbb{D}^{m\times(n-1)}$ for $n \ge 3$. Let $A = [a_{ij}] \in
\mathbb{D}^{m\times n}$, $\bar A = [a_{ij}]_{i=j=1}^{m,n-1}$. Use the induction hypothesis to assume that
$\bar A = \operatorname{diag}(d_1,\ldots,d_l)$, $l = \min(m, n-1)$. Here $d_1 \mid d_i$, $i = 2,\ldots,l$. Interchange
the second and the last column of $A$ to obtain $A_1 = [a_{ij}^{(1)}] \in \mathbb{D}^{m\times n}$. Perform
simple row operations on the rows of $A_1$ and simple column operations on
the first $n-1$ columns of $A_1$ to obtain the matrix $A_2 = [a_{ij}^{(2)}] \in \mathbb{D}^{m\times n}$
such that $\bar A_2 = [a_{ij}^{(2)}]_{i=j=1}^{m,n-1} = \operatorname{diag}(a_{11}^{(2)},\ldots,a_{ll}^{(2)})$ is Smith's normal form
of $\bar A_1 = [a_{ij}^{(1)}]_{i=j=1}^{m,n-1}$. The definition of $A_2$ yields that $i_1(A) = a_{11}^{(2)}$. Use
elementary row operations to obtain an equivalent matrix to $A_2$:

(1.13.3) $A_3 = \begin{bmatrix} i_1(A) & 0 \\ 0 & A_4 \end{bmatrix}$, $A_4 \in \mathbb{D}^{(m-1)\times(n-1)}$.

As $i_1(A) = \delta_1(A) = \delta_1(A_3)$ it follows that $i_1(A)$ divides all the entries of
$A_4$. So $A_4 = i_1(A)B_4$. Hence $i_j(A_4) = i_1(A)i_j(B_4)$. Use simple row and
column operations on the last $m-1$ rows and the last $n-1$ columns of $A_3$
to bring $B_4$ to Smith's normal form using the induction hypothesis:

$A_3 \sim A_5 = \begin{bmatrix} i_1(A) & 0 \\ 0 & i_1(A)\operatorname{diag}(i_1(B_4),\ldots,i_l(B_4)) \end{bmatrix}$.

By induction hypothesis

$i_j(B_4) \mid i_{j+1}(B_4)$, $j = 1,\ldots,\operatorname{rank} A - 1$, $i_j(B_4) = 0$, $j > \operatorname{rank} A - 1$.

A similar claim holds for $A_4$. Hence

$\delta_k(A) = \delta_k(A_5) = i_1(A)i_1(A_4) \cdots i_{k-1}(A_4)$, $k = 2,\ldots,\operatorname{rank} A$.

Thus

$i_j(A_4) = i_{j+1}(A)$, $j = 1,\ldots,\operatorname{rank} A - 1$,

and $A_5$ is equivalent to $B$ given by (1.13.1). Furthermore, we showed
(1.13.2). □

The matrix (1.13.1) is called the Smith normal form of A. 
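
Over $\mathbb{Z}$ the $2 \times 2$ step can be carried out explicitly for a diagonal matrix $\operatorname{diag}(p,q)$: with $d = (p,q) = ap + bq$, two unimodular matrices bring it to $\operatorname{diag}(d, pq/d)$. The sketch below is an illustration under these assumptions (integer entries, $p$ and $q$ not both zero); the function names are hypothetical and the matrices $V$, $U$ are one possible choice, not the ones constructed in the proof.

```python
def extended_gcd(p, q):
    """Return (d, a, b) with d = gcd(p, q) >= 0 and a*p + b*q == d."""
    old_r, r = p, q
    old_a, a = 1, 0
    old_b, b = 0, 1
    while r != 0:
        k = old_r // r
        old_r, r = r, old_r - k * r
        old_a, a = a, old_a - k * a
        old_b, b = b, old_b - k * b
    if old_r < 0:
        old_r, old_a, old_b = -old_r, -old_a, -old_b
    return old_r, old_a, old_b


def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def smith_of_diagonal(p, q):
    """Return (V, U, S) with det V = det U = 1 and V * diag(p, q) * U == S."""
    d, a, b = extended_gcd(p, q)
    if d == 0:
        raise ValueError("p and q must not both be zero")
    # All divisions below are exact because d divides both p and q.
    V = [[a, b], [-q // d, p // d]]
    U = [[1, -b * q // d], [1, a * p // d]]
    S = matmul(matmul(V, [[p, 0], [0, q]]), U)
    return V, U, S
```

Both $V$ and $U$ have determinant $(ap + bq)/d = 1$, so they are invertible over $\mathbb{Z}$, and the product $S$ is $\operatorname{diag}(d, pq/d)$, in agreement with $i_1 = \delta_1$ and $i_1 i_2 = \delta_2$.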

Corollary 1.13.2 Let $A, B \in \mathbb{D}_{ED}^{m\times n}$. Then $A$ and $B$ are equivalent if
and only if $A$ and $B$ have the same rank and the same invariant factors.

Over an elementary divisor domain, the system (1.12.2) is equivalent to
a simple system

(1.13.4) $i_k(A)y_k = c_k$, $k = 1,\ldots,\operatorname{rank} A$,
         $0 = c_k$, $k = \operatorname{rank} A + 1,\ldots,m$,

(1.13.5) $y = P^{-1}x$, $c = Qb$.

Here $P$ and $Q$ are the invertible matrices appearing in (1.10.9) and $B$ is of
the form (1.13.1). For the system (1.13.4) Theorems 1.12.2 and 1.12.3 are
straightforward. Clearly

Theorem 1.13.3 Let $A \in \mathbb{D}_{ED}^{m\times n}$. Assume that all the invariant factors
of $A$ are invertible elements in $\mathbb{D}_{ED}$. Then the basis of Range $A$ can be
completed to a basis of $\mathbb{D}_{ED}^m$.

In what follows we adopt

Normalization 1.13.4 Let $A \in \mathbb{F}[x]^{m\times n}$. Then the invariant polynomials
(the invariant factors) of $A$ are assumed to be normalized polynomials.

Notation 1.13.5 Let $A_i \in \mathbb{D}^{m_i\times n_i}$ for $i = 1,\ldots,k$. Then $\oplus_{i=1}^k A_i =
\operatorname{diag}(A_1,\ldots,A_k)$ denotes the block diagonal matrix $B = [B_{ij}]_{i,j=1}^k \in \mathbb{D}^{m\times n}$,
where $B_{ij} \in \mathbb{D}^{m_i\times n_j}$ for $i,j = 1,\ldots,k$, $m = \sum_{i=1}^k m_i$, $n = \sum_{j=1}^k n_j$, such
that $B_{ii} = A_i$ and $B_{ij} = 0$ for $i \ne j$.



Problems 



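1. Let $A = \begin{bmatrix} p & 0 \\ 0 & q \end{bmatrix} \in \mathbb{D}_B^{2\times 2}$. Then $A$ is equivalent to $\operatorname{diag}\left((p,q), \dfrac{pq}{(p,q)}\right)$.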



2. Let $A \in \mathbb{D}_{ED}^{m\times n}$, $B \in \mathbb{D}_{ED}^{p\times q}$. Suppose that either $i_s(A) \mid i_t(B)$ or
$i_t(B) \mid i_s(A)$ for $s = 1,\ldots,\operatorname{rank} A = \alpha$, $t = 1,\ldots,\operatorname{rank} B = \beta$. Show
that the set of the invariant factors of $A \oplus B$ is $\{i_1(A),\ldots,i_\alpha(A), i_1(B),\ldots,i_\beta(B)\}$.

3. Let $M \subset N$ be $\mathbb{D}_{ED}$-modules with finite bases. Prove that there
exists a basis $u_1,\ldots,u_n$ in $N$ such that $i_1u_1,\ldots,i_ru_r$ is a basis in $M$,
where $i_1,\ldots,i_r \in \mathbb{D}_{ED}$ and $i_j \mid i_{j+1}$ for $j = 1,\ldots,r-1$.

4. Let $M$ be a $\mathbb{D}$-module and $N_1, N_2 \subset M$ be submodules. $N_1$ and $N_2$
are called equivalent if there exists an isomorphism $T \in \operatorname{Hom}(M,M)$
($T^{-1} \in \operatorname{Hom}(M,M)$) such that $TN_1 = N_2$. Suppose that $M$, $N_1$, $N_2$
have bases $[u_1,\ldots,u_m]$, $[v_1,\ldots,v_n]$ and $[w_1,\ldots,w_n]$ respectively. Let

(1.13.6) $v_j = \sum_{i=1}^m a_{ij}u_i$, $w_j = \sum_{i=1}^m b_{ij}u_i$, $j = 1,\ldots,n$,

$A = [a_{ij}]_{i=j=1}^{m,n}$, $B = [b_{ij}]_{i=j=1}^{m,n}$.

Show that $N_1$ and $N_2$ are equivalent if and only if $A \sim B$.

5. Let $N \subset M$ be $\mathbb{D}$-modules with bases. Assume that $N$ has the division
property: if $ax \in N$ for $0 \ne a \in \mathbb{D}$ and $x \in M$ then $x \in N$. Show that
if $\mathbb{D}$ is an elementary divisor domain and $N$ has the division property
then any basis in $N$ can be completed to a basis in $M$.

6. Let $\mathbb{D}$ be an elementary divisor domain. Assume that $N \subset \mathbb{D}^m$ is a
submodule with a basis of dimension $k \in [1,m]$. Let $N' \subset \mathbb{D}^m$ be the
following set: $n \in N'$ if there exists $0 \ne a \in \mathbb{D}$ such that $an \in N$.
Show that $N'$ is a submodule of $\mathbb{D}^m$ which has the division property.
Furthermore, $N'$ has a basis of dimension $k$ which can be obtained
from a basis of $N$ as follows. Let $w_1,\ldots,w_k$ be a basis of $N$. Let
$W \in \mathbb{D}^{m\times k}$ be the matrix whose columns are $w_1,\ldots,w_k$. Assume
that $D = \operatorname{diag}(n_1,\ldots,n_k)$ is the Smith normal form of $W$. So $W =
UDV$, $U \in \mathrm{GL}(m,\mathbb{D})$, $V \in \mathrm{GL}(k,\mathbb{D})$. Let $u_1,\ldots,u_k$ be the first $k$
columns of $U$. Then $u_1,\ldots,u_k$ is a basis of $N'$.

1.14 The ring of local analytic functions in one variable

In this section we consider applications of the Smith normal form to the system of
linear equations over $H_0$, the ring of local analytic functions in one variable
at the origin. In §1.3 we showed that the only noninvertible irreducible
element in $H_0$ is $z$. Let $A \in H_0^{m\times n}$. Then $A = A(z) = [a_{ij}(z)]_{i=j=1}^{m,n}$ and
$A(z)$ has the Maclaurin expansion

(1.14.1) $A(z) = \sum_{k=0}^{\infty} A_k z^k$, $A_k \in \mathbb{C}^{m\times n}$, $k = 0,\ldots,$

which converges in some disk $|z| < R(A)$. Here $R(A)$ is a positive number
which depends on $A$. That is, each entry $a_{ij}(z)$ has a convergent Maclaurin
series for $|z| < R(A)$.

Notations and Definitions 1.14.1 Let $A \in H_0^{m\times n}$. Then the local
invariant polynomials (the invariant factors) of $A$ are normalized to be

(1.14.2) $i_k(A) = z^{\iota_k(A)}$, $0 \le \iota_1(A) \le \iota_2(A) \le \ldots \le \iota_r(A)$, $r = \operatorname{rank} A$.

The number $\iota_r(A)$ is called the index of $A$ and is denoted by $\eta = \eta(A)$. For
a nonnegative integer $p$ denote by $\kappa_p = \kappa_p(A)$ the number of local invariant
polynomials of $A$ whose degree is equal to $p$.

We start with the following perturbation result. 

Lemma 1.14.2 Let $A, B \in H_0^{m\times n}$. Let

(1.14.3) $C(z) = A(z) + z^{k+1}B(z)$,

where $k$ is a nonnegative integer. Then $A$ and $C$ have the same local
invariant polynomials up to degree $k$. Moreover, if $k$ is equal to the index of
$A$, and $A$ and $C$ have the same ranks then $A$ is equivalent to $C$.

Proof. Since $H_0$ is a Euclidean domain we may already assume that
$A$ is in Smith's normal form

(1.14.4) $A = \operatorname{diag}(z^{\iota_1},\ldots,z^{\iota_r},0,\ldots,0)$.

Let $s = \sum_{j=0}^{k} \kappa_j(A)$. Assume first that $s \ge t \in \mathbb{N}$. Consider any $t \times t$
submatrix $D(z)$ of $C(z) = [c_{ij}(z)]$. View $\det D(z)$ as a sum of $t!$ products.
As $k+1 > \iota_t$ it follows that each such product is divisible by $z^{\iota_1+\cdots+\iota_t}$. Let
$D(z) = [c_{ij}(z)]_{i=j=1}^{t}$. Then the product of the diagonal entries is of the
form $z^{\iota_1+\cdots+\iota_t}(1 + zO(z))$. All other $t!-1$ products appearing in $\det D(z)$
are divisible by $z^{\iota_1+\cdots+\iota_{t-2}+2(k+1)}$. Hence

(1.14.5) $\delta_t(C) = z^{\iota_1+\cdots+\iota_t} = \delta_t(A)$, $t = 1,\ldots,s$,

which implies that

(1.14.6) $\iota_t(C) = \iota_t(A)$, $t = 1,\ldots,s$.

As $s = \sum_{j=0}^{k} \kappa_j(A)$ it follows that

$\kappa_j(C) = \kappa_j(A)$, $j = 0,\ldots,k-1$, $\kappa_k(A) \le \kappa_k(C)$.

Write $A = C - z^{k+1}B$ and deduce from the above arguments that

(1.14.7) $\kappa_j(C) = \kappa_j(A)$, $j = 0,\ldots,k$.

Hence $A$ and $C$ have the same local invariant polynomials up to degree $k$.
Suppose that $k = \eta(A)$ and $\operatorname{rank} A = \operatorname{rank} C$. Then (1.14.6) implies that $A$ and $C$ have
the same local invariant polynomials. Hence $A \sim C$. □
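
The rank hypothesis in the second part of the lemma cannot be dropped. The following small example, added here as an illustration and not taken from the text, uses $k = \eta(A) = 1$ and two different choices of $B$.

```latex
% Illustration of Lemma 1.14.2 with k = eta(A) = 1.
\[
A(z) = \begin{bmatrix} z & 0 \\ 0 & 0 \end{bmatrix},\qquad
\kappa_0(A) = 0,\ \kappa_1(A) = 1,\ \operatorname{rank} A = 1 .
\]
\begin{align*}
B = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}:&\quad
C = A + z^2 B = \begin{bmatrix} z(1+z) & 0 \\ 0 & 0 \end{bmatrix}
  \sim A \quad (1 + z \text{ is invertible in } H_0),\\
B = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}:&\quad
C = A + z^2 B = \begin{bmatrix} z & 0 \\ 0 & z^2 \end{bmatrix},\quad
\kappa_0(C) = 0,\ \kappa_1(C) = 1,\ \operatorname{rank} C = 2 .
\end{align*}
```

In the second case $A$ and $C$ still share their local invariant polynomials up to degree $1$, but $C$ has the additional invariant polynomial $z^2$, so $A$ and $C$ are not equivalent.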

Consider a system of linear equations over $H_0$

(1.14.8) $A(z)u = b(z)$, $A(z) \in H_0^{m\times n}$, $b(z) \in H_0^{m}$,

where we look for a solution $u(z) \in H_0^{n}$. Theorem 1.12.2 claims that the
above system is solvable if and only if $\operatorname{rank} A = \operatorname{rank} \hat A = r$ and the g.c.d.
of all $r \times r$ minors of $A$ and $\hat A$ are equal. In the area of analytic functions
it is common to try to solve (1.3.6) by the method of power series. Assume
that $A(z)$ has the expansion (1.14.1) and $b(z)$ has the expansion

(1.14.9) $b(z) = \sum_{k=0}^{\infty} b_k z^k$, $b_k \in \mathbb{C}^m$, $k = 0,\ldots$

Then one looks for a formal solution

(1.14.10) $u(z) = \sum_{k=0}^{\infty} u_k z^k$, $u_k \in \mathbb{C}^n$, $k = 0,\ldots,$

which satisfies

(1.14.11) $\sum_{j=0}^{k} A_{k-j}u_j = b_k$,

for $k = 0,\ldots$ A vector $u(z)$ is called a formal solution of (1.14.8) if (1.14.11)
holds for any $k \in \mathbb{Z}_+$. A vector $u(z)$ is called an analytic solution if $u(z)$ is
a formal solution and the series (1.14.10) converges in some neighborhood
of the origin, i.e. $u(z) \in H_0^{n}$. We now give the exact conditions for which
(1.14.11) is solvable for $k = 0,\ldots,q$.
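
For given coefficients $A_0,\ldots,A_q$ and $b_0,\ldots,b_q$ the equations (1.14.11) for $k = 0,\ldots,q$ form one finite linear system with a block lower triangular Toeplitz matrix, so solvability can be tested by comparing ranks. The sketch below assumes rational coefficients (so that exact arithmetic with `Fraction` applies); the function names are illustrative and not from the text.

```python
from fractions import Fraction


def rank(M):
    """Rank of a matrix (list of rows) over the rationals, by elimination."""
    rows = [[Fraction(x) for x in row] for row in M]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def block_toeplitz(A, q):
    """Matrix of (1.14.11) for k = 0..q; A[k] is the m x n coefficient A_k."""
    m, n = len(A[0]), len(A[0][0])
    zero = [[0] * n for _ in range(m)]
    T = []
    for k in range(q + 1):
        for i in range(m):
            row = []
            for j in range(q + 1):
                # Block (k, j) is A_{k-j}; it vanishes for j > k or beyond the data.
                blk = A[k - j] if 0 <= k - j < len(A) else zero
                row.extend(blk[i])
            T.append(row)
    return T


def truncated_solvable(A, b, q):
    """True if u_0..u_q exist with sum_j A_{k-j} u_j = b_k for k = 0..q."""
    T = block_toeplitz(A, q)
    m = len(A[0])
    rhs = [b[k][i] if k < len(b) else 0 for k in range(q + 1) for i in range(m)]
    augmented = [row + [x] for row, x in zip(T, rhs)]
    return rank(T) == rank(augmented)
```

With the scalar example $A(z) = z$, the call `truncated_solvable([[[0]], [[1]]], [[0], [1]], 1)` tests $b(z) = z$, and `truncated_solvable([[[0]], [[1]]], [[1]], 0)` tests $b(z) = 1$, for which already the equation $A_0u_0 = b_0$ fails.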

Theorem 1.14.3 Consider the system (1.14.11) for $k = 0,\ldots,q \in \mathbb{Z}_+$.
Then this system is solvable if and only if $A(z)$ and $\hat A(z)$ have the same
local invariant polynomials up to degree $q$:

(1.14.12) $\kappa_j(A) = \kappa_j(\hat A)$, $j = 0,\ldots,q$.

Assume that the system (1.14.8) is solvable over $H_0$. Let $q = \eta(A)$ and
suppose that $u_0,\ldots,u_q$ satisfies (1.14.11) for $k = 0,\ldots,q$. Then there exists
$u(z) \in H_0^{n}$ satisfying (1.14.8) and $u(0) = u_0$.

Let $q \in \mathbb{Z}_+$ and $W_q \subset \mathbb{C}^n$ be the subspace of all vectors $w_0$ such that
$w_0,\ldots,w_q$ is a solution to the homogeneous system

(1.14.13) $\sum_{j=0}^{k} A_{k-j}w_j = 0$, $k = 0,\ldots,q$.

Then

(1.14.14) $\dim W_q = n - \sum_{j=0}^{q} \kappa_j(A)$.

In particular, for $\eta = \eta(A)$ and any $w_0 \in W_\eta$ there exists $w(z) \in H_0^{n}$ such
that

(1.14.15) $A(z)w(z) = 0$, $w(0) = w_0$.
Proof. Let

$u_k = (u_{k,1},\ldots,u_{k,n})^T$, $k = 0,\ldots,q$.

We first establish the theorem when $A(z)$ is Smith's normal form (1.14.4).
In that case the system (1.14.11) reduces to

(1.14.16) $u_{k-\iota_s,s} = b_{k,s}$ if $\iota_s \le k$,
          $0 = b_{k,s}$ if either $\iota_s > k$ or $s > \operatorname{rank} A$.

The above equations are solvable for $k = 0,\ldots,q$ if and only if $z^{\iota_s}$ divides
$b_s(z)$ for all $\iota_s \le q$, and for $\iota_s > q$, $z^{q+1}$ divides $b_s(z)$. If $\iota_s \le q$ then subtract
from the last column of $\hat A$ the $s$-column times $\dfrac{b_s(z)}{z^{\iota_s}}$. So $\hat A$ is equivalent to
the matrix

$A_1(z) = \operatorname{diag}(z^{\iota_1},\ldots,z^{\iota_l}) \oplus z^{q+1}A_2(z)$,
$l = \sum_{j=0}^{q} \kappa_j(A)$, $A_2 \in H_0^{(m-l)\times(n+1-l)}$.

According to Problem 2 the local invariant polynomials of $A_1(z)$ whose
degree does not exceed $q$ are $z^{\iota_1},\ldots,z^{\iota_l}$. So $A(z)$ and $A_1(z)$ have the same
local invariant polynomials up to degree $q$. Assume now that $A$ and $\hat A$
have the same local invariant polynomials up to degree $q$. Hence

$z^{\iota_1+\cdots+\iota_k} = \delta_k(A) = \delta_k(\hat A)$, $k = 1,\ldots,l$, $z^{\iota_1+\cdots+\iota_l+q+1} \mid \delta_{l+1}(\hat A)$.

The first set of the equalities implies that $z^{\iota_s} \mid b_s(z)$, $s = 1,\ldots,l$. The last
relation yields that for $s > l$, $z^{q+1} \mid b_s(z)$. Hence (1.14.11) is solvable for
$k = 0,\ldots,q$ if and only if $A$ and $\hat A$ have the same local invariant polynomials
up to degree $q$.

Assume next that (1.14.8) is solvable. Since $A(z)$ is of the form (1.14.4)
the general solution of (1.14.8) in that case is

$u_j(z) = \dfrac{b_j(z)}{z^{\iota_j}}$, $j = 1,\ldots,\operatorname{rank} A$,
$u_j(z)$ is an arbitrary function in $H_0$, $j = \operatorname{rank} A + 1,\ldots,n$.

Hence

(1.14.17) $u_j(0) = b_{\iota_j,j}$, $j = 1,\ldots,\operatorname{rank} A$,
          $u_j(0)$ is an arbitrary complex number, $j = \operatorname{rank} A + 1,\ldots,n$.

Clearly (1.14.16) implies that $u_{0,s} = u_s(0)$ for $k = \iota_s$. The solvability of
(1.14.8) implies that $b_s(z) = 0$ for $s > \operatorname{rank} A$. So $u_{0,s}$ is not determined
by (1.14.16) for $s > \operatorname{rank} A$. This proves the existence of $u(z)$ satisfying
(1.14.8) such that $u(0) = u_0$. Consider the homogeneous system (1.14.13)
for $k = 0,\ldots,q$. Then $w_{0,s} = 0$ for $\iota_s \le q$ and otherwise $w_{0,s}$ is a free variable.
Hence (1.14.14) holds. Let $q = \eta = \eta(A)$. Then the system (1.14.13)
implies that the coordinates of $w_0$ satisfy the conditions (1.14.17). Hence
the system (1.14.15) is solvable.

Assume now that $A(z) \in H_0^{m\times n}$ is an arbitrary matrix. Theorem 1.13.1
implies the existence of

$P(z) \in \mathrm{GL}(n,H_0)$, $Q(z) \in \mathrm{GL}(m,H_0)$ such that

$Q(z)A(z)P(z) = B(z) = \operatorname{diag}(z^{\iota_1},\ldots,z^{\iota_r},0,\ldots,0)$, $0 \le \iota_1 \le \ldots \le \iota_r$, $r = \operatorname{rank} A$.

It is straightforward to show that $P(z) \in \mathrm{GL}(n,H_0)$ if and only if $P(z) \in
H_0^{n\times n}$ and $P(0)$ is invertible. To this end let

$P(z) = \sum_{k=0}^{\infty} P_k z^k$, $P_k \in \mathbb{C}^{n\times n}$, $k = 0,\ldots,$ $\det P_0 \ne 0$,

$Q(z) = \sum_{k=0}^{\infty} Q_k z^k$, $Q_k \in \mathbb{C}^{m\times m}$, $k = 0,\ldots,$ $\det Q_0 \ne 0$.

Introduce a new set of variables $v(z)$ and $v_0, v_1, \ldots$ such that

$u(z) = P(z)v(z)$,

$u_k = \sum_{j=0}^{k} P_{k-j}v_j$, $k = 0,\ldots$

Since $\det P_0 \ne 0$, $v(z) = P(z)^{-1}u(z)$ and we can express each $v_k$ in terms
of $u_k,\ldots,u_0$ for $k = 0,1,\ldots$ Then (1.14.8) and (1.14.11) are respectively
equivalent to

$B(z)v(z) = c(z)$, $c(z) = Q(z)b(z)$,

$\sum_{j=0}^{k} B_{k-j}v_j = c_k$, $k = 0,\ldots,q$.

As $B \sim A$ and $\hat B = Q\hat A(P \oplus I_1) \sim \hat A$ we deduce the theorem. □
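
As a check of (1.14.14) on a small case, take $n = 3$ and $A(z) = \operatorname{diag}(1, z, z^2)$, so that $\kappa_0(A) = \kappa_1(A) = \kappa_2(A) = 1$ and $\eta(A) = 2$. This worked example is an illustration added here and is not part of the original argument.

```latex
% A_0 = diag(1,0,0), A_1 = diag(0,1,0), A_2 = diag(0,0,1);
% w_k = (w_{k,1}, w_{k,2}, w_{k,3})^T.
\begin{align*}
k = 0:&\quad A_0 w_0 = 0
  &&\Longleftrightarrow\ w_{0,1} = 0,\\
k = 1:&\quad A_1 w_0 + A_0 w_1 = 0
  &&\Longleftrightarrow\ w_{0,2} = 0,\ w_{1,1} = 0,\\
k = 2:&\quad A_2 w_0 + A_1 w_1 + A_0 w_2 = 0
  &&\Longleftrightarrow\ w_{0,3} = 0,\ w_{1,2} = 0,\ w_{2,1} = 0 .
\end{align*}
\[
\dim W_0 = 2 = 3 - \kappa_0,\qquad
\dim W_1 = 1 = 3 - (\kappa_0 + \kappa_1),\qquad
\dim W_2 = 0 = 3 - (\kappa_0 + \kappa_1 + \kappa_2).
\]
```

Each additional order $k$ removes exactly the coordinates $w_{0,s}$ with $\iota_s = k$, which is the counting behind (1.14.14).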
Problems 

1. The system (1.14.8) is called solvable in the punctured disc if the
system

(1.14.18) $A(z_0)u(z_0) = b(z_0)$,

is solvable for any point $0 < |z_0| < R$ as a linear system over $\mathbb{C}$ for
some $R > 0$, i.e.

(1.14.19) $\operatorname{rank} A(z_0) = \operatorname{rank} \hat A(z_0)$, for all $0 < |z_0| < R$.

Show that (1.14.8) is solvable in the punctured disc if and only if
(1.14.8) is solvable over $\mathcal{M}_0$, the quotient field of $H_0$.

2. The system (1.14.8) is called pointwise solvable if (1.14.18) is solvable
for all $z_0$ in some open disk $|z_0| < R$. Show that (1.14.8) is pointwise
solvable if and only if (1.14.8) is solvable over $\mathcal{M}_0$ and

(1.14.20) $\operatorname{rank} A(0) = \operatorname{rank} \hat A(0)$.

3. Let $A(z) \in H_0^{m\times n}$. $A(z)$ is called generic if whenever the system
(1.3.6) is pointwise solvable then it is analytically solvable, i.e.
solvable over $H_0$. Prove that $A(z)$ is generic if and only if $\eta(A) \le 1$.

4. Let $\Omega \subset \mathbb{C}$ be a domain and consider the system

(1.14.21) $A(z)u = b(z)$, $A(z) \in H(\Omega)^{m\times n}$, $b(z) \in H(\Omega)^m$.

Show that the above system is solvable over $H(\Omega)$ if and only if for
each $\zeta \in \Omega$ this system is solvable in $H_\zeta$. (Hint: As $H(\Omega)$ is $\mathbb{D}_{ED}$ it
suffices to analyze the case where $A(z)$ is in its Smith's normal form.)

5. Let $A(z)$ and $b(z)$ satisfy the assumptions of Problem 4. $A(z)$ is called
generic if whenever (1.14.21) is pointwise solvable it is solvable over
$H(\Omega)$. Show that $A(z)$ is generic if and only if the invariant functions
(factors) of $A(z)$ have only simple zeros. ($\zeta$ is a simple zero of $f \in
H(\Omega)$ if $f(\zeta) = 0$ and $f'(\zeta) \ne 0$.)

6. Let $A(z) \in H(\Omega)^{m\times n}$, where $\Omega$ is a domain in $\mathbb{C}$. Prove that the
invariant factors of $A(z)$ are invertible in $H(\Omega)$ if and only if

(1.14.22) $\operatorname{rank} A(\zeta) = \operatorname{rank} A$, for all $\zeta \in \Omega$.

7. Let $A(z) \in H(\Omega)^{m\times n}$, where $\Omega$ is a domain in $\mathbb{C}$. Assume that
(1.14.22) holds. View $A(z) \in \operatorname{Hom}(H(\Omega)^n, H(\Omega)^m)$. Show that
Range $A(z)$ has a basis which can be completed to a basis in $H(\Omega)^m$.
(Hint: Use Theorem 1.14.5.)

1.15 The local-global domains in $\mathbb{C}^p$

Let $p$ be a positive integer and assume that $\Omega \subset \mathbb{C}^p$ is a domain. Consider
the system of $m$ nonhomogeneous equations in $n$ unknowns:

(1.15.1) $A(z)u = b(z)$, $A(z) \in H(\Omega)^{m\times n}$, $b(z) \in H(\Omega)^m$.

In this section we are concerned with the problem of existence of a
solution $u(z) \in H(\Omega)^n$ to the above system. Clearly a necessary condition
for the solvability is the local condition:

Condition 1.15.1 Let $\Omega \subset \mathbb{C}^p$ be a domain. For each $\zeta \in \Omega$ the system
$A(z)u = b(z)$ has a solution $u_\zeta(z) \in H_\zeta^n$.

Definition 1.15.2 A domain $\Omega \subset \mathbb{C}^p$ is called a local-global domain
if any system of the form (1.15.1), satisfying Condition 1.15.1, has a
solution $u(z) \in H(\Omega)^n$.

Problem 1.14.4 implies that any domain $\Omega \subset \mathbb{C}$ is a local-global domain.
In this section we assume that $p > 1$. Problem 1 shows that not every
domain in $\mathbb{C}^p$ is a local-global domain. We give a sufficient condition on
a domain $\Omega$ to be a local-global domain.

Definition 1.15.3 A domain $\Omega \subset \mathbb{C}^p$ is called a domain of holomorphy
if there exists $f \in H(\Omega)$ such that for any larger domain $\Omega_1 \subset \mathbb{C}^p$,
strictly containing $\Omega$, there is no $f_1 \in H(\Omega_1)$ which coincides with $f$ on $\Omega$.

The following theorem is a very special case of Hartogs' theorem [GuR65].

Theorem 1.15.4 Let $\Omega \subset \mathbb{C}^p$, $p > 1$ be a domain. Assume that $\zeta \in \Omega$
and $f \in H(\Omega \setminus \{\zeta\})$. Then $f \in H(\Omega)$.

Thus $\Omega \setminus \{\zeta\}$ is not a domain of holomorphy. A simple example of a
domain of holomorphy is [GuR65]:

Example 1.15.5 Let $\Omega \subset \mathbb{C}^p$ be an open convex set. Then $\Omega$ is a domain
of holomorphy.

(See §7.1 for the definition of convexity.) The main result of this section
is:

Theorem 1.15.6 Let $\Omega \subset \mathbb{C}^p$, $p > 1$ be a domain of holomorphy. Then
$\Omega$ is a local-global domain.

The proof needs basic knowledge of sheaves and is included for the
reader who has been exposed to the basic concepts in this field. See for
example [GuR65]. We discuss only the very special type of sheaves which are
needed for the proof of Theorem 1.15.6.

Definition 1.15.7 Let $\Omega \subset \mathbb{C}^p$ be an open set. Then

1. $\mathcal{F}(\Omega)$, called the sheaf of rings of holomorphic functions on $\Omega$, is the
union of all $H(U)$, where $U$ ranges over all open subsets of $\Omega$. Then
for each $\zeta \in \Omega$ the local ring $H_\zeta$ is viewed as a subset of $\mathcal{F}(\Omega)$ and
is called the stalk of $\mathcal{F}(\Omega)$ over $\zeta$. A function $f \in H(U)$ is called a
section of $\mathcal{F}(\Omega)$ on $U$.

2. For an integer $n \ge 1$, $\mathcal{F}_n(\Omega)$, called an $\mathcal{F}(\Omega)$-sheaf of modules, is the
union of all $H(U)^n$, where $U$ ranges over all open subsets of $\Omega$. Then
for each $\zeta \in \Omega$ the local module $H_\zeta^n$ is viewed as a subset of $\mathcal{F}_n(\Omega)$
and is called the stalk of $\mathcal{F}_n(\Omega)$ over $\zeta$. (Note $H_\zeta^n$ is an $H_\zeta$-module.)
A vector $u \in H(U)^n$ is called a section on $U$. (If $U = \emptyset$ then $H(U)^n$
consists of the zero element $0$.)

3. $\mathcal{F} \subset \mathcal{F}_n(\Omega)$ is called a subsheaf if the following conditions hold. First,
$\mathcal{F} \cap H(U)^n$ contains the trivial section for each open set $U \subset \Omega$.
Second, assume that $u \in H(U)^n \cap \mathcal{F}$, $v \in H(V)^n \cap \mathcal{F}$ and $W \subset U \cap V$ is
an open nonempty set. Then for any $f, g \in H(W)$ the vector $fu + gv \in
\mathcal{F} \cap H(W)^n$. (Restriction property.) Third, if $u = v$ on $W$ then the
section $w \in H(U \cup V)^n$, which coincides with $u, v$ on $U, V$ respectively,
belongs to $\mathcal{F} \cap H(U \cup V)^n$. (Extension property.) $\mathcal{F}_\zeta := \mathcal{F} \cap H_\zeta^n$ is
the stalk of $\mathcal{F}$ over $\zeta \in \Omega$.

(a) Let $U$ be an open subset of $\Omega$. Then $\mathcal{F}(U) := \mathcal{F} \cap \mathcal{F}_n(U)$ is called
the restriction of the sheaf $\mathcal{F}$ to $U$.

(b) Let $U$ be an open subset of $\Omega$. The sections $u_1,\ldots,u_k \in \mathcal{F} \cap
H(U)^n$ are said to generate $\mathcal{F}$ on $U$ if for any $\zeta \in U$, $\mathcal{F}_\zeta$ is
generated by $u_1,\ldots,u_k$ over $H_\zeta$. $\mathcal{F}$ is called finitely generated
over $U$ if such $u_1,\ldots,u_k \in \mathcal{F} \cap H(U)^n$ exist. $\mathcal{F}$ is called finitely
generated if it is finitely generated over $\Omega$. $\mathcal{F}$ is called of finite
type if for each $\zeta \in \Omega$ there exists an open set $U_\zeta \subset \Omega$,
containing $\zeta$, such that $\mathcal{F}$ is finitely generated over $U_\zeta$. (I.e. each
$\mathcal{F}_\zeta$ is finitely generated.)

(c) $\mathcal{F}$ is called a coherent sheaf if the following two conditions hold.
First, $\mathcal{F}$ is of finite type. Second, for each open set $U \subset \Omega$ and for
any $q \ge 1$ sections $u_1,\ldots,u_q \in \mathcal{F} \cap H(U)^n$ let $\mathcal{G} \subset \mathcal{F}_q(U)$ be the
subsheaf generated by the condition $\sum_{i=1}^{q} f_iu_i = 0$. That is, $\mathcal{G}$
is the union of all $(f_1,\ldots,f_q)^T \in H(V)^q$ satisfying the condition
$\sum_{i=1}^{q} f_iu_i = 0$ for all open $V \subset U$. Then $\mathcal{G}$ is of finite type.

The following result is a straightforward consequence of Oka's coherence
theorem [GuR65].

Theorem 1.15.8 Let $\Omega \subset \mathbb{C}^p$ be an open set. Then

• The sheaf $\mathcal{F}_n(\Omega)$ is coherent.

• Let $A \in H(\Omega)^{m\times n}$ be given. Let $\mathcal{F} \subset \mathcal{F}_n(\Omega)$ be the subsheaf consisting
of all $u \in H(U)^n$ satisfying $Au = 0$ for all open sets $U \subset \Omega$. Then $\mathcal{F}$
is coherent.

Note that $\mathcal{F}_n(\Omega)$ is generated by $n$ constant sections $u_i := (\delta_{i1},\ldots,\delta_{in})^T \in
H(\Omega)^n$, $i = 1,\ldots,n$. The following theorem is a special case of Cartan's
Theorem A.

Theorem 1.15.9 Let $\Omega \subset \mathbb{C}^p$ be a domain of holomorphy. Let $\mathcal{F} \subset
\mathcal{F}_n(\Omega)$ be a subsheaf defined in Definition 1.15.7. If $\mathcal{F}$ is coherent then $\mathcal{F}$
is finitely generated.

Corollary 1.15.10 Let $\Omega \subset \mathbb{C}^p$ be a domain of holomorphy and $A \in
H(\Omega)^{m\times n}$. Then there exist $u_1,\ldots,u_l \in H(\Omega)^n$ such that for any $\zeta \in \Omega$,
every solution of the system $Au = 0$ over $H_\zeta^n$ is of the form $\sum_{i=1}^{l} f_iu_i$ for
some $f_1,\ldots,f_l \in H_\zeta$.

We now introduce the notion of sheaf cohomology of $\mathcal{F} \subset \mathcal{F}_n(\Omega)$.

Definition 1.15.11 Let $\Omega \subset \mathbb{C}^p$ be an open set. Let $\mathcal{U} := \{U_i \subset \Omega,\ i \in
\mathcal{I}\}$ be an open cover of $\Omega$. (I.e. each $U_i$ is open, and $\cup_{i\in\mathcal{I}} U_i = \Omega$.) For
each integer $p \ge 0$ and $(p+1)$-tuple of indices $(i_0,\ldots,i_p) \in \mathcal{I}^{p+1}$ denote

$U_{i_0\ldots i_p} := \cap_{j=0}^{p} U_{i_j}$.

Assume that $\mathcal{F} \subset \mathcal{F}_n(\Omega)$ is a subsheaf. A $p$-cochain $c$ is a map carrying
each $(p+1)$-tuple of indices $(i_0,\ldots,i_p)$ to a section in $\mathcal{F} \cap H(U_{i_0\ldots i_p})^n$ satisfying
the following properties.

1. $c(i_0,\ldots,i_p) = 0$ if $U_{i_0\ldots i_p} = \emptyset$.

2. $c(\pi(i_0),\ldots,\pi(i_p)) = \operatorname{sgn}(\pi)c(i_0,\ldots,i_p)$ for any permutation $\pi : \{0,\ldots,p\} \to
\{0,\ldots,p\}$. (Note that $c(i_0,\ldots,i_p)$ is the trivial section if $i_j = i_k$ for
$j \ne k$.)

The zero cochain is the cochain which assigns the zero section to any $(i_0,\ldots,i_p)$.
Two cochains $c, d$ are added and subtracted by the identity $(c \pm d)(i_0,\ldots,i_p) =
c(i_0,\ldots,i_p) \pm d(i_0,\ldots,i_p)$. Denote by $C^p(\Omega,\mathcal{F},\mathcal{U})$ the group of $p$-cochains.

The $p$-th coboundary operator $\delta_p : C^p(\Omega,\mathcal{F},\mathcal{U}) \to C^{p+1}(\Omega,\mathcal{F},\mathcal{U})$ is
defined as follows:

$(\delta_p c)(i_0,\ldots,i_{p+1}) = \sum_{j=0}^{p+1} (-1)^j c(i_0,\ldots,\hat{i}_j,\ldots,i_{p+1})$,

where $\hat{i}_j$ is a deleted index. Then the $p$-th cohomology group is given by

1. $H^0(\Omega,\mathcal{F},\mathcal{U}) := \operatorname{Ker} \delta_0$.

2. For $p \ge 1$, $H^p(\Omega,\mathcal{F},\mathcal{U}) := \operatorname{Ker} \delta_p / \operatorname{Range} \delta_{p-1}$. (See Problem 2.)
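
The sign pattern in the definition of $\delta_p$ can be checked mechanically. The following is a toy model only: it assumes integer-valued cochains on a finite index set and ignores the restriction of sections to intersections, so it illustrates the combinatorics of the formula rather than sheaf cohomology itself. The function name `coboundary` is illustrative and not from the text.

```python
from itertools import product


def coboundary(c, p, indices):
    """Apply delta_p to a p-cochain.

    c maps (p+1)-tuples of indices to integers; the result maps
    (p+2)-tuples to the alternating sum over deleted indices.
    """
    out = {}
    for tup in product(indices, repeat=p + 2):
        total = 0
        for j in range(p + 2):
            face = tup[:j] + tup[j + 1:]  # delete the j-th index
            total += (-1) ** j * c[face]
        out[tup] = total
    return out
```

Starting from any 0-cochain `d` (a dictionary keyed by 1-tuples), `coboundary(d, 0, indices)` gives $(\delta_0 d)(i_0,i_1) = d(i_1) - d(i_0)$, which is alternating, and applying `coboundary(..., 1, indices)` to the result gives the zero cochain, in line with Problem 2.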

Lemma 1.15.12 Let the assumptions of Definition 1.15.11 hold. Let
$c \in C^0(\Omega,\mathcal{F},\mathcal{U})$. Then $c \in H^0(\Omega,\mathcal{F},\mathcal{U})$ if and only if $c$ represents a global
section $u \in \mathcal{F} \cap H(\Omega)^n$.

Proof. Let $c \in C^0(\Omega,\mathcal{F},\mathcal{U})$. Assume that $c \in H^0(\Omega,\mathcal{F},\mathcal{U})$. Let
$U_{i_0}, U_{i_1}$ be two open sets in $\mathcal{U}$. Then $c(i_0) - c(i_1)$ is the zero section on $U_{i_0} \cap U_{i_1}$.
Thus for each $\zeta \in U_{i_0} \cap U_{i_1}$, $c(i_0)(\zeta) = c(i_1)(\zeta)$. Let $u(z) := c(i_0)(z) \in \mathbb{C}^n$ for $z \in U_{i_0}$.
It follows that $u \in H(\Omega)^n$. The extension property of the subsheaf $\mathcal{F}$ yields
that $u \in \mathcal{F} \cap H(\Omega)^n$. Vice versa, assume that $u \in \mathcal{F} \cap H(\Omega)^n$. Define
$c(i_0) = u|U_{i_0}$. Then $c \in H^0(\Omega,\mathcal{F},\mathcal{U})$. □

We identify $H^0(\Omega,\mathcal{F},\mathcal{U})$ with the set of global sections $\mathcal{F} \cap H(\Omega)^n$. The
cohomology groups $H^p(\Omega,\mathcal{F},\mathcal{U})$, $p \ge 1$ depend on the open cover $\mathcal{U}$ of $\Omega$. By
refining the covers of $\Omega$ one can define the cohomology groups $H^p(\Omega,\mathcal{F})$, $p \ge
0$. See Problem 3. Cartan's Theorem B claims [GuR65]:

Theorem 1.15.13 Let $\Omega \subset \mathbb{C}^p$ be a domain of holomorphy. Assume that
the sheaf $\mathcal{F}$ given in Definition 1.15.7 is coherent. Then $H^p(\Omega,\mathcal{F})$ is trivial
for any $p \ge 1$.

Proof of Theorem 1.15.6. Consider the system (1.15.1). Let $\mathcal{F}$ be
the coherent sheaf defined in Theorem 1.15.8. Assume that the system
(1.15.1) is locally solvable over $\Omega$. Let $\zeta \in \Omega$. Then there exists an open set
$U_\zeta \subset \Omega$ such that there exists $u_\zeta \in H(U_\zeta)^n$ satisfying (1.15.1) over $H(U_\zeta)$.
Let $\mathcal{U} := \{U_\zeta,\ \zeta \in \Omega\}$ be an open cover of $\Omega$. Define $c \in C^1(\Omega,\mathcal{F},\mathcal{U})$ by
$c(\zeta,\eta) = u_\zeta - u_\eta$. Note that

$(\delta_1 c)(\zeta,\eta,\theta) = c(\eta,\theta) - c(\zeta,\theta) + c(\zeta,\eta) = 0$.

Hence $c \in \operatorname{Ker} \delta_1$. Since $\mathcal{F}$ is coherent Cartan's Theorem B yields that
$H^1(\Omega,\mathcal{F})$ is trivial. Hence $H^1(\Omega,\mathcal{F},\mathcal{U})$ is trivial. (See Problem 3c.) Thus,
there exists an element $d \in C^0(\Omega,\mathcal{F},\mathcal{U})$ such that $\delta_0 d = c$. Thus for each
$\zeta, \eta \in \Omega$ such that $U_\zeta \cap U_\eta \ne \emptyset$ there exist sections $d(\zeta) \in \mathcal{F} \cap H(U_\zeta)^n$, $d(\eta) \in
\mathcal{F} \cap H(U_\eta)^n$ such that $d(\eta) - d(\zeta) = u_\zeta - u_\eta$ on $U_\zeta \cap U_\eta$. Hence $d(\eta) +
u_\eta = d(\zeta) + u_\zeta$ on $U_\zeta \cap U_\eta$. Since $Ad(\zeta) = 0 \in H(U_\zeta)^m$ it follows that
$A(d(\zeta) + u_\zeta) = b \in H(U_\zeta)^m$. As $d(\eta) + u_\eta = d(\zeta) + u_\zeta$ on $U_\zeta \cap U_\eta$ it follows
that all these sections can be patched to the vector $v \in H(\Omega)^n$ which is a
global solution of (1.15.1). □



Problems

1. Consider a system of one equation over $\mathbb{C}^p$, $p > 1$:

$\sum_{i=1}^{p} z_iu_i = 1$, $u = (u_1,\ldots,u_p)^T$, $z = (z_1,\ldots,z_p)$.

Let $\Omega := \mathbb{C}^p \setminus \{0\}$.

(a) Show that Condition 1.15.1 holds for $\Omega$.

(b) Show that the system is not solvable at $z = 0$. (Hence it does
not have a solution $u(z) \in H_0^p$.)

(c) Show that the system does not have a solution $u(z) \in H(\Omega)^p$.
(Hint: Prove by contradiction using Hartogs' theorem.)

2. Let the assumptions of Definition 1.15.11 hold. Show for any $p \ge 0$:

(a) $\delta_{p+1}\delta_p = 0$.

(b) $\operatorname{Range} \delta_p \subset \operatorname{Ker} \delta_{p+1}$.

3. Let $\mathcal{U} = \{U_i,\ i \in \mathcal{I}\}$, $\mathcal{V} = \{V_j,\ j \in \mathcal{J}\}$ be two open covers of an open
set $\Omega \subset \mathbb{C}^p$. $\mathcal{V}$ is called a refinement of $\mathcal{U}$, denoted $\mathcal{V} \prec \mathcal{U}$, if each
$V_j$ is contained in some $U_i$. For each $V_j$ we fix an arbitrary $U_i$ with
$V_j \subset U_i$, and write it as $U_{i(j)}$: $V_j \subset U_{i(j)}$. Let $\mathcal{F}$ be a subsheaf as in
Definition 1.15.11. Show

(a) Define $\phi : C^p(\Omega,\mathcal{F},\mathcal{U}) \to C^p(\Omega,\mathcal{F},\mathcal{V})$ as follows. For $c \in
C^p(\Omega,\mathcal{F},\mathcal{U})$ let $(\phi(c))(j_0,\ldots,j_p)$ be the restriction
of the section $c(i(j_0),\ldots,i(j_p))$ to $V_{j_0\ldots j_p}$. Then $\phi$ is a
homomorphism.

(b) $\phi$ induces a homomorphism $\tilde\phi : H^p(\Omega,\mathcal{F},\mathcal{U}) \to H^p(\Omega,\mathcal{F},\mathcal{V})$.
Furthermore, $\tilde\phi$ depends only on the covers $\mathcal{U}, \mathcal{V}$. (I.e., the choice
of $i(j)$ is irrelevant.)

(c) By refining the covers one obtains the $p$-th cohomology group
$H^p(\Omega,\mathcal{F})$ with the following property. The homomorphism $\tilde\phi$
described in 3b induces an injective homomorphism $\tilde\phi :
H^p(\Omega,\mathcal{F},\mathcal{U}) \to H^p(\Omega,\mathcal{F})$ for $p \ge 1$. (Recall that $H^0(\Omega,\mathcal{F},\mathcal{U}) =
\mathcal{F} \cap H(\Omega)^n$.) In particular, $H^p(\Omega,\mathcal{F})$ is trivial, i.e. $H^p(\Omega,\mathcal{F}) =
\{0\}$, if and only if each $H^p(\Omega,\mathcal{F},\mathcal{U})$ is trivial.

1.16 Historical remarks 

Most of the material in Sections 1.1-1.10 is standard. See [Lan58], [Lan67]
and [vdW59] for the algebraic concepts. Consult [GuR65] and [Rud74] for
the concepts and results concerning the analytic functions. See [Kap49] for
the properties of elementary divisor domains. It is not known if there exists
a Bezout domain which is not an elementary divisor domain. Theorem 1.5.3
for $\Omega = \mathbb{C}$ is due to [Hel40]. For §1.10 see [CuR62] or [McD33]. Most of §1.11
is well known, e.g. [McD33]. §1.12 seems to be new since the underlying
ring is assumed to be only a Bezout domain. Theorems 1.12.2 and 1.12.3
are well known for an elementary divisor domain, since $A$ is equivalent to a
diagonal matrix. It would be interesting to generalize Theorem 1.12.2 for
$\mathbb{D} = \mathbb{F}[x_1,\ldots,x_p]$ for $p \ge 2$. The fact that the Smith normal form can be
achieved for $\mathbb{D}_{ED}$ is due to Helmer [Hel43]. More can be found in [Kap49].

Most of the results of §1.14 are from [Fri80b]. I assume that Theorem
1.15.6 is known to the experts.

Chapter 2

Canonical Forms for 
Similarity 

2.1 Strict equivalence of pencils 

Definition 2.1.1 A matrix $A(x) \in \mathbb{D}[x]^{m\times n}$ is a pencil if

(2.1.1) $A(x) = A_0 + xA_1$, $A_0, A_1 \in \mathbb{D}^{m\times n}$.

A pencil $A(x)$ is regular if $m = n$ and $\det A(x) \ne 0$. Otherwise $A(x)$ is a
singular pencil. Two pencils $A(x), B(x) \in \mathbb{D}[x]^{m\times n}$ are strictly equivalent
if

(2.1.2) $A(x) \overset{s}{\sim} B(x) \iff B(x) = QA(x)P$, $P \in \mathrm{GL}(n,\mathbb{D})$, $Q \in \mathrm{GL}(m,\mathbb{D})$.

The classical works of Weierstrass [Wei67] and Kronecker [Kro90] classify
the equivalence classes of pencils under the strict equivalence relation
in the case $\mathbb{D}$ is a field $\mathbb{F}$. We give a short account of their main results.

First note that the strict equivalence of $A(x), B(x)$ implies the equivalence
of $A(x), B(x)$ over the domain $\mathbb{D}[x]$. Furthermore let

(2.1.3) $B(x) = B_0 + xB_1$.

Then the condition (2.1.2) is equivalent to

(2.1.4) $B_0 = QA_0P$, $B_1 = QA_1P$, $P \in \mathrm{GL}(n,\mathbb{D})$, $Q \in \mathrm{GL}(m,\mathbb{D})$.

Thus we can interchange $A_0$ with $A_1$ and $B_0$ with $B_1$ without affecting the
strict equivalence relation. Hence it is natural to consider a homogeneous
pencil

(2.1.5) $A(x_0,x_1) = x_0A_0 + x_1A_1$.

Assume that $\mathbb{D}$ is $\mathbb{D}_U$. Then $\mathbb{D}_U[x_0,x_1]$ is also $\mathbb{D}_U$ (Problem 1.4.6). In
particular $\mathbb{D}_U[x_0,x_1]$ is $\mathbb{D}_G$. Let $\delta_k(x_0,x_1)$, $i_k(x_0,x_1)$ be the invariant
determinants and factors of $A(x_0,x_1)$ respectively for $k = 1,\ldots,\operatorname{rank} A(x_0,x_1)$.

Lemma 2.1.2 Let $A(x_0,x_1)$ be a homogeneous pencil over $\mathbb{D}_U[x_0,x_1]$.
Then its invariant determinants and the invariant polynomials
$\delta_k(x_0,x_1)$, $i_k(x_0,x_1)$, $k = 1,\ldots,\operatorname{rank} A(x_0,x_1)$ are homogeneous
polynomials. Moreover, if $\delta_k(x)$ and $i_k(x)$ are the invariant determinants and factors
of the pencil $A(x)$ for $k = 1,\ldots,\operatorname{rank} A(x)$, then

(2.1.6) $\delta_k(x) = \delta_k(1,x)$, $i_k(x) = i_k(1,x)$, $k = 1,\ldots,\operatorname{rank} A(x)$.

Proof. Clearly any $k \times k$ minor of $A(x_0,x_1)$ is either zero or a
homogeneous polynomial of degree $k$. In view of Problem 1 we deduce that
the g.c.d. of all nonvanishing $k \times k$ minors is a homogeneous polynomial
$\delta_k(x_0,x_1)$. As $i_k(x_0,x_1) = \dfrac{\delta_k(x_0,x_1)}{\delta_{k-1}(x_0,x_1)}$, Problem 1 implies that $i_k(x_0,x_1)$ is
a homogeneous polynomial. Consider the pencil $A(x) = A(1,x)$. So $\delta_k(x)$,
the g.c.d. of the $k \times k$ minors of $A(x)$, is obviously divisible by $\delta_k(1,x)$. On the
other hand we have the following relation between the minors of $A(x_0,x_1)$
and $A(x)$:

(2.1.7) $\det A(x_0,x_1)[\alpha,\beta] = x_0^k \det A\left(\dfrac{x_1}{x_0}\right)[\alpha,\beta]$, $\alpha \in [m]_k$, $\beta \in [n]_k$.

This shows that $x_0^{\rho_k}\delta_k\left(\dfrac{x_1}{x_0}\right)$ ($\rho_k = \deg \delta_k(x)$) divides any $k \times k$ minor of
$A(x_0,x_1)$. So $x_0^{\rho_k}\delta_k\left(\dfrac{x_1}{x_0}\right) \mid \delta_k(x_0,x_1)$. This proves the first part of (2.1.6). So

(2.1.8) $\delta_k(x_0,x_1) = x_0^{\phi_k}\left(x_0^{\rho_k}\delta_k\left(\dfrac{x_1}{x_0}\right)\right)$, $\rho_k = \deg \delta_k(x)$, $\phi_k \ge 0$.

The equality

$i_k(x_0,x_1) = \dfrac{\delta_k(x_0,x_1)}{\delta_{k-1}(x_0,x_1)}$

implies

(2.1.9) $i_k(x_0,x_1) = x_0^{\psi_k}\left(x_0^{\sigma_k}i_k\left(\dfrac{x_1}{x_0}\right)\right)$, $\sigma_k = \deg i_k(x)$, $\psi_k \ge 0$.

□
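
The exponents $\phi_k$, $\psi_k$ in (2.1.8) and (2.1.9) can be strictly positive, which is exactly the information that the homogeneous pencil carries beyond $A(x)$. A minimal worked example over a field, added here as an illustration and not taken from the text:

```latex
% A_0 = diag(1,0), A_1 = diag(0,1).
\[
A(x) = \begin{bmatrix} 1 & 0 \\ 0 & x \end{bmatrix},\qquad
A(x_0,x_1) = \begin{bmatrix} x_0 & 0 \\ 0 & x_1 \end{bmatrix}.
\]
\begin{align*}
\delta_1(x) &= 1, & \delta_2(x) &= x, & i_1(x) &= 1, & i_2(x) &= x,\\
\delta_1(x_0,x_1) &= \gcd(x_0,x_1) = 1, & \delta_2(x_0,x_1) &= x_0x_1,
  & i_1(x_0,x_1) &= 1, & i_2(x_0,x_1) &= x_0x_1 .
\end{align*}
\[
\delta_2(1,x) = x = \delta_2(x),\qquad
x_0^{\rho_2}\,\delta_2\!\left(\tfrac{x_1}{x_0}\right) = x_0\cdot\tfrac{x_1}{x_0} = x_1,\qquad
\delta_2(x_0,x_1) = x_0^{1}\cdot x_1,\ \text{ so } \phi_2 = 1 .
\]
```

Here (2.1.6) holds, while the extra factor $x_0$ in $\delta_2(x_0,x_1)$ and $i_2(x_0,x_1)$ reflects the singularity of $A_1$ and is invisible in the invariant polynomials of $A(x)$.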



5 k (xo,Xi) and i k (xo,xi) are called the invariant homogeneous deter- 
minants and the invariant homogeneous polynomials (factors) respectively. 
The classical result due to Wcicrstrass [Wei67] states: 
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Theorem 2.1.3 Let A(x) e F[x]™ x " be a regular pencil. Then a pencil 
B(x) <G F[x]" xn is strictly equivalent to A(x) if and only if A(x) and B(x) 
have the same invariant homogeneous polynomials. 

Proof. The necessary part of the theorem holds for any $A(x), B(x)$
which are strictly equivalent. Suppose now that $A(x)$ and $B(x)$ have the
same invariant homogeneous polynomials. According to (1.4.4) the pencils
$A(x)$ and $B(x)$ have the same invariant polynomials. So $A(x) \sim B(x)$ over
$\mathbb{F}[x]$. Therefore

(2.1.10) $W(x)B(x) = A(x)U(x), \quad U(x), W(x) \in GL(n, \mathbb{F}[x]).$

Assume first that $A_1$ and $B_1$ are nonsingular. Then (see Problem 2) it is
possible to divide $W(x)$ by $A(x)$ from the left and to divide $U(x)$ by $B(x)$
from the right:

(2.1.11) $W(x) = A(x)W_1(x) + R, \quad U(x) = U_1(x)B(x) + P,$

where $P$ and $R$ are constant matrices. Substituting (2.1.11) into (2.1.10) gives

$A(x)(W_1(x) - U_1(x))B(x) = A(x)P - RB(x).$

Since $A_1, B_1 \in GL(n, \mathbb{F})$ we must have that $W_1(x) = U_1(x)$, otherwise
the left-hand side of the above equality would be of degree 2 at least (see
Definition 2.1.5), while the degree of the right-hand side is at most 1. So

(2.1.12) $W_1(x) = U_1(x), \quad RB(x) = A(x)P.$

It is left to show that $P$ and $R$ are nonsingular. Let $V(x) = W(x)^{-1} \in
GL(n, \mathbb{F}[x])$. Then $I = W(x)V(x)$. Let $V(x) = B(x)V_1(x) + S$. Use the
second identity of (2.1.12) to obtain

$I = (A(x)W_1(x) + R)V(x) = A(x)W_1(x)V(x) + RV(x)$
$\;\; = A(x)W_1(x)V(x) + RB(x)V_1(x) + RS$
$\;\; = A(x)W_1(x)V(x) + A(x)PV_1(x) + RS$
$\;\; = A(x)(W_1(x)V(x) + PV_1(x)) + RS.$

Since $A_1 \in GL(n, \mathbb{F})$ the above equality implies

$W_1(x)V(x) + PV_1(x) = 0, \quad RS = I.$

Hence $R$ is invertible. Similar arguments show that $P$ is invertible. Thus
$A(x) \overset{s}{\sim} B(x)$ if $\det A_1 \det B_1 \ne 0$.

Consider now the general case. Introduce new variables $y_0, y_1$:

$y_0 = ax_0 + bx_1, \quad y_1 = cx_0 + dx_1, \quad ad - cb \ne 0.$

Then

$A(y_0,y_1) = y_0A_0' + y_1A_1', \quad B(y_0,y_1) = y_0B_0' + y_1B_1'.$

Clearly $A(y_0,y_1)$ and $B(y_0,y_1)$ have the same invariant homogeneous
polynomials. Also $A(y_0,y_1) \overset{s}{\sim} B(y_0,y_1) \iff A(x_0,x_1) \overset{s}{\sim} B(x_0,x_1)$. Since
$A(x_0,x_1)$ and $B(x_0,x_1)$ are regular pencils it is possible to choose $a,b,c,d$
such that $A_1'$ and $B_1'$ are nonsingular. This shows that $A(y_0,y_1) \overset{s}{\sim} B(y_0,y_1)$.
Hence $A(x) \overset{s}{\sim} B(x)$. □

Using the proof of Theorem 2.1.3 and Problem 2 we obtain: 

Theorem 2.1.4 Let $A(x), B(x) \in \mathbb{D}[x]^{n \times n}$. Assume that $A_1, B_1 \in
GL(n, \mathbb{D})$. Then $A(x) \overset{s}{\sim} B(x) \iff A(x) \sim B(x)$.

For singular pencils the invariant homogeneous polynomials alone do
not determine the class of strictly equivalent pencils. We now introduce
the notion of column and row indices for $A(x) \in \mathbb{F}[x]^{m \times n}$. Consider the
system (1.14.15). The set of all solutions $w(x)$ is an $\mathbb{F}[x]$-module M with
a finite basis $w_1(x), \ldots, w_s(x)$ (Theorem 1.12.3). To specify a choice of
a basis we need the following definition.

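Definition 2.1.5 Let $A \in \mathbb{D}[x_1, \ldots, x_k]^{m \times n}$. So

(2.1.13) $A(x_1, \ldots, x_k) = \sum_{|\alpha| \le d} A_\alpha x^\alpha, \quad A_\alpha \in \mathbb{D}^{m \times n},$
$\qquad \alpha = (\alpha_1, \ldots, \alpha_k) \in \mathbb{Z}_+^k, \quad |\alpha| = \sum_{i=1}^{k} \alpha_i, \quad x^\alpha = x_1^{\alpha_1} \cdots x_k^{\alpha_k}.$

Then the degree of $A(x_1, \ldots, x_k) \ne 0$ (deg A) is $d$ if there exists $A_\alpha \ne 0$ with
$|\alpha| = d$. Let $\deg 0 = 0$.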

Definition 2.1.6 Let $A \in \mathbb{F}[x]^{m \times n}$ and consider the module $M \subset \mathbb{F}[x]^n$
of all solutions of (1.14.15). Choose a basis $w_1(x), \ldots, w_s(x)$, $s = n - \operatorname{rank} A$,
in M such that $w_k(x) \in M$ has the lowest degree among all $w(x) \in M$
which are linearly independent over $\mathbb{F}(x)$ of $w_1(x), \ldots, w_{k-1}(x)$, for $k = 1, \ldots, s$.
Then the column indices $0 \le \alpha_1 \le \alpha_2 \le \ldots \le \alpha_s$ of $A(x)$ are given as

(2.1.14) $\alpha_k = \deg w_k(x), \quad k = 1, \ldots, s.$

The row indices $0 \le \beta_1 \le \beta_2 \le \ldots \le \beta_t$, $t = m - \operatorname{rank} A$, of $A(x)$ are the
column indices of $A(x)^T$.






It can be shown [Gan59] that the column (row) indices are independent
of a particular allowed choice of a basis $w_1(x), \ldots, w_s(x)$. We state the
Kronecker result [Kro90]. (See [Gan59] for a proof.)

Theorem 2.1.7 The pencils $A(x), B(x) \in \mathbb{F}[x]^{m \times n}$ are strictly
equivalent if and only if they have the same invariant homogeneous polynomials
and the same row and column indices.

For a canonical form of a singular pencil under the strict equivalence see 
Problems 8-12. 



Problems 



1. Using the fact that $\mathbb{D}_U[x_1, \ldots, x_n]$ is $\mathbb{D}_U$ and the equality (1.13.5) show
that if $a \in \mathbb{D}_U[x_1, \ldots, x_n]$ is a homogeneous polynomial then in the
decomposition (1.3.1) each $p_i$ is a homogeneous polynomial.

2. Let

(2.1.15) $W(x) = \sum_{k=0}^{q} W_k x^k, \quad U(x) = \sum_{k=0}^{p} U_k x^k \in \mathbb{D}[x]^{n \times n}.$

Assume that $A(x) = A_0 + xA_1$ such that $A_0 \in \mathbb{D}^{n \times n}$ and $A_1 \in
GL(n, \mathbb{D})$. Show that if $p, q \ge 1$ then

$W(x) = A(x)A_1^{-1}(W_q x^{q-1}) + \tilde{W}(x), \quad U(x) = (U_p x^{p-1})A_1^{-1}A(x) + \tilde{U}(x),$

where

$\deg \tilde{W}(x) < q, \quad \deg \tilde{U}(x) < p.$

Prove the equalities (2.1.11) where $R$ and $P$ are constant matrices.
Suppose that $A_1 = I$. Show that $R$ and $P$ in (2.1.11) can be given as

(2.1.16) $R = \sum_{k=0}^{q} (-A_0)^k W_k, \quad P = \sum_{k=0}^{p} U_k (-A_0)^k.$


3. Let $A(x) \in \mathbb{D}_U[x]^{n \times n}$ be a regular pencil such that $\det A_1 \ne 0$. Prove
that in (2.1.8) and (2.1.9) $\varphi_k = \psi_k = 0$ for $k = 1, \ldots, n$. (Use equality
(1.12.12) for $A(x)$ and $A(x_0,x_1)$.)



4. Consider the following two pencils

$A(x) = \begin{bmatrix} 2+x & 1+x & 3+3x \\ 3+x & 2+x & 5+2x \\ 3+x & 2+x & 5+2x \end{bmatrix}, \quad B(x) = \begin{bmatrix} 2+x & 1+x & 1+x \\ 1+x & 2+x & 2+x \\ 1+x & 1+x & 1+x \end{bmatrix}$

over $\mathbb{R}[x]$. Show that $A(x)$ and $B(x)$ are equivalent but not strictly
equivalent.

5. Let

$A(x) = \sum_{k=0}^{q} A_k x^k \in \mathbb{F}[x]^{m \times n}.$

Put

$A(x_0,x_1) = \sum_{k=0}^{q} A_k x_0^{q-k} x_1^k, \quad q = \deg A(x).$

Prove that $i_k(x_0,x_1)$ is a homogeneous polynomial for $k = 1, \ldots, \operatorname{rank} A(x)$.
Show that $i_1(1,x), \ldots, i_k(1,x)$ are the invariant factors of $A(x)$.

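6. Let $A(x), B(x) \in \mathbb{F}[x]^{m \times n}$. $A(x)$ and $B(x)$ are called strictly
equivalent ($A(x) \overset{s}{\sim} B(x)$) if

$B(x) = PA(x)Q, \quad P \in GL(m, \mathbb{F}), \quad Q \in GL(n, \mathbb{F}).$

Show that if $A(x) \overset{s}{\sim} B(x)$ then $A(x_0,x_1)$ and $B(x_0,x_1)$ have the same
invariant factors.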



7. Let $A(x), B(x) \in \mathbb{F}[x]^{m \times n}$. Show that $A(x) \overset{s}{\sim} B(x) \iff A(x)^T \overset{s}{\sim} B(x)^T$.



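8. (a) Let $L_m(x) \in \mathbb{F}[x]^{m \times (m+1)}$ be the matrix with 1 on the main diagonal
and $x$ on the diagonal above it, and all other entries 0:

$L_m(x) = \begin{bmatrix} 1 & x & 0 & \cdots & 0 \\ 0 & 1 & x & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 & x \end{bmatrix}.$

Show that $\operatorname{rank} L_m = m$ and $\alpha_1 = m$.

(b) Let $1 \le \alpha_1 \le \ldots \le \alpha_s$, $1 \le \beta_1 \le \ldots \le \beta_t$ be integers. Assume
that $B(x) = B_0 + xB_1 \in \mathbb{F}[x]^{l \times l}$ is a regular pencil. Show that
$A(x) = B(x) \oplus_{i=1}^{s} L_{\alpha_i} \oplus_{j=1}^{t} L_{\beta_j}^T$ has the column and the row
indices $1 \le \alpha_1 \le \ldots \le \alpha_s$, $1 \le \beta_1 \le \ldots \le \beta_t$ respectively.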



9. Show that if a pencil $A(x)$ is a direct sum of pencils of the forms below,
in which at least one summand of the form (a) or (b) appears, then $A(x)$ is a
singular pencil.

(a) $L_m(x)$.

(b) $L_m(x)^T$.

(c) $B(x) = B_0 + xB_1 \in \mathbb{F}[x]^{l \times l}$, a regular pencil.

10. Show that a singular pencil $A(x)$ is strictly equivalent to a singular
pencil of the form given in Problem 9 if and only if there are no column and
row indices equal to 0.

11. Assume that $A(x) \in \mathbb{F}[x]^{m \times n}$ is a singular pencil.

(a) Show that $A(x)$ has exactly $k$ column indices equal to 0 if and
only if it is strictly equivalent to $[0_{m \times k} \; A_1(x)]$, $A_1(x) \in \mathbb{F}[x]^{m \times (n-k)}$,
where $A_1(x)$ is either regular or singular. If $A_1(x)$ is singular
then the row indices of $A_1(x)$ are the row indices of $A(x)$, and
the column indices of $A_1(x)$ are the nonzero column indices
of $A(x)$.

(b) By considering $A(x)^T$ state and prove a similar result for the row
indices of $A(x)$.

12. Use Problems 8-11 to find a canonical form for a singular pencil $A(x)$
under the strict equivalence.

2.2 Similarity of matrices 

Definition 2.2.1 Let $A, B \in \mathbb{D}^{m \times m}$. Then A and B are similar ($A \approx B$) if

(2.2.1) $B = QAQ^{-1},$

for some $Q \in GL(m, \mathbb{D})$.

Clearly the similarity relation is an equivalence relation. So $\mathbb{D}^{m \times m}$ is
divided into equivalence classes which are called the similarity classes. For
a $\mathbb{D}$-module M we let Hom (M) := Hom (M, M). It is a standard fact
that each similarity class corresponds to all possible representations of
some $T \in$ Hom (M), where M is a $\mathbb{D}$-module having a basis of $m$
elements. Indeed, let $[u_1, \ldots, u_m]$ be a basis in M. Then $T$ is represented by
$A = [a_{ij}] \in \mathbb{D}^{m \times m}$, where

(2.2.2) $Tu_j = \sum_{i=1}^{m} a_{ij}u_i, \quad j = 1, \ldots, m.$

Let $[\tilde{u}_1, \ldots, \tilde{u}_m]$ be another basis in M. Assume that $Q \in GL(m, \mathbb{D})$ is
given by (1.10.5). According to (2.2.2) and the arguments of §1.10, the
representation of $T$ in the basis $[\tilde{u}_1, \ldots, \tilde{u}_m]$ is given by the matrix $B$ of the
form (2.2.1).

The similarity notion of matrices is closely related to strict equivalence
of certain regular pencils.

Lemma 2.2.2 Let $A, B \in \mathbb{D}^{m \times m}$. Associate with these matrices the
following regular pencils

(2.2.3) $A(x) = -A + xI, \quad B(x) = -B + xI.$

Then A and B are similar if and only if the pencils $A(x)$ and $B(x)$ are
strictly equivalent.

Proof. Assume first that $A \approx B$. Then (2.2.1) implies (2.1.2) where
$P = Q^{-1}$. Suppose now that $A(x) \overset{s}{\sim} B(x)$. So $B = QAP$, $I = QP$. That is
$P = Q^{-1}$ and $A \approx B$. □

Clearly $A(x) \overset{s}{\sim} B(x) \Rightarrow A(x) \sim B(x)$.

Corollary 2.2.3 Let $A, B \in \mathbb{D}_U^{m \times m}$. Assume that A and B are similar.
Then the corresponding pencils $A(x), B(x)$ given by (2.2.3) have the same
invariant polynomials.

In the case $\mathbb{D}_U = \mathbb{F}$ the above condition is also sufficient, in view
of Lemma 2.2.2 and Theorem 2.1.4.

Theorem 2.2.4 Let $A, B \in \mathbb{F}^{m \times m}$. Then A and B are similar if and
only if the pencils $A(x)$ and $B(x)$ given by (2.2.3) have the same invariant
polynomials.

It can be shown (see Problem 1) that even over Euclidean domains the
condition that $A(x)$ and $B(x)$ have the same invariant polynomials does
not imply in general that $A \approx B$.

Problems 

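1. Let

$A = \begin{bmatrix} 1 & 0 \\ 0 & 5 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 1 \\ 0 & 5 \end{bmatrix}.$

Show that $A(x)$ and $B(x)$ given by (2.2.3) have the same invariant
polynomials over $\mathbb{Z}[x]$. Show that A and B are not similar over $\mathbb{Z}$.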

2. Let $A(x) \in \mathbb{D}_U[x]^{n \times n}$ be given by (2.2.3). Let $i_1(x), \ldots, i_n(x)$ be the
invariant polynomials of $A(x)$. Using the equality (1.12.12) show that
each $i_k(x)$ can be assumed to be a normalized polynomial and

(2.2.4) $\sum_{k=1}^{n} \deg i_k(x) = n.$



3. Let $A \in \mathbb{F}^{n \times n}$. Show that $A \approx A^T$.

2.3 The companion matrix

Theorem 2.2.4 shows that if $A \in \mathbb{F}^{n \times n}$ then the invariant polynomials
determine the similarity class of A. We now show that any set of normalized
polynomials $i_1(x), \ldots, i_n(x) \in \mathbb{D}_U[x]$, such that $i_j(x) \,|\, i_{j+1}(x)$, $j = 1, \ldots, n-1$,
and which satisfy (2.2.4), are invariant polynomials of $xI - A$ for some
$A \in \mathbb{D}_U^{n \times n}$. To do so we introduce the notion of a companion matrix.



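Definition 2.3.1 Let $p(x) \in \mathbb{D}[x]$ be a normalized polynomial

$p(x) = x^m + a_1x^{m-1} + \cdots + a_m.$

Then $C(p) = [c_{ij}]_1^m \in \mathbb{D}^{m \times m}$ is the companion matrix of $p(x)$ if

(2.3.1) $c_{ij} = \delta_{(i+1)j}, \quad i = 1, \ldots, m-1, \ j = 1, \ldots, m,$
$\qquad c_{mj} = -a_{m-j+1}, \quad j = 1, \ldots, m,$

$C(p) = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 0 & 1 \\ -a_m & -a_{m-1} & -a_{m-2} & \cdots & -a_2 & -a_1 \end{bmatrix}.$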

Lemma 2.3.2 Let $p(x) \in \mathbb{D}_U[x]$ be a normalized polynomial of degree
$m$. Consider the pencil $C(x) = xI - C(p)$. Then the invariant polynomials
of $C(x)$ are

(2.3.2) $i_1(C) = \ldots = i_{m-1}(C) = 1, \quad i_m(C) = p(x).$

Proof. For $k < m$ consider a minor of $C(x)$ composed of the rows $1, \ldots, k$
and the columns $2, \ldots, k+1$. Since this minor is the determinant of a lower
triangular matrix with $-1$ on the main diagonal we deduce that its value
is $(-1)^k$. So $\delta_k(C) = 1$, $k = 1, \ldots, m-1$. This establishes the first equality
in (2.3.2). Clearly $\delta_m(C) = \det(xI - C)$. Expand the determinant of $C(x)$
by the first row and use induction to prove that $\det(xI - C) = p(x)$. This
shows that $i_m(C) = \frac{\delta_m(C)}{\delta_{m-1}(C)} = p(x)$.

□
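As a quick numerical check of Definition 2.3.1, the following minimal sketch builds $C(p)$ for a sample cubic and evaluates $p(C(p))$ by Horner's rule. The helper names `companion`, `matmul` and `poly_at_matrix` are illustrative choices and not notation from the text; the coefficient list is assumed to be $[a_1, \ldots, a_m]$.

```python
def companion(a):
    """Companion matrix of p(x) = x^m + a[0] x^(m-1) + ... + a[m-1], as in (2.3.1)."""
    m = len(a)
    C = [[0] * m for _ in range(m)]
    for i in range(m - 1):
        C[i][i + 1] = 1              # ones on the superdiagonal
    for j in range(m):
        C[m - 1][j] = -a[m - 1 - j]  # last row: -a_m, -a_{m-1}, ..., -a_1
    return C


def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def poly_at_matrix(a, C):
    """Evaluate p(C) = C^m + a_1 C^(m-1) + ... + a_m I by Horner's rule."""
    n = len(C)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for coeff in a:
        result = matmul(result, C)
        for i in range(n):
            result[i][i] += coeff
    return result


# Sample polynomial: p(x) = (x - 1)(x - 2)(x - 3) = x^3 - 6x^2 + 11x - 6
a = [-6, 11, -6]
C = companion(a)
p_of_C = poly_at_matrix(a, C)   # expected to be the zero matrix
```

For this sample, `C` has last row `[6, -11, 6]` and `p_of_C` is the zero matrix, in agreement with $\det(xI - C) = p(x)$.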



Using the results of Problem 2.1.13 and Lemma 2.3.2 we get:

Theorem 2.3.3 Let $p_j(x) \in \mathbb{D}_U[x]$ be normalized polynomials of
positive degrees such that $p_j(x) \,|\, p_{j+1}(x)$, $j = 1, \ldots, k-1$. Consider the matrix

(2.3.3) $C(p_1, \ldots, p_k) = \oplus_{j=1}^{k} C(p_j).$

Then the nontrivial invariant polynomials of $xI - C(p_1, \ldots, p_k)$ (i.e. those
polynomials which are not the identity element) are $p_1(x), \ldots, p_k(x)$.

Combining Theorems 2.2.4 and 2.3.3 we obtain a canonical representation
for the similarity class in $\mathbb{F}^{n \times n}$.

Theorem 2.3.4 Let $A \in \mathbb{F}^{n \times n}$ and assume that $p_j(x) \in \mathbb{F}[x]$, $j =
1, \ldots, k$ are the nontrivial normalized invariant polynomials of $xI - A$. Then
A is similar to $C(p_1, \ldots, p_k)$.

Definition 2.3.5 For $A \in \mathbb{F}^{n \times n}$ the matrix $C(p_1, \ldots, p_k)$ is called the
rational canonical form of A.

Let $\mathbb{F}$ be the quotient field of $\mathbb{D}$. Assume that $A \in \mathbb{D}^{n \times n}$. Let $C(p_1, \ldots, p_k)$
be the rational canonical form of A in $\mathbb{F}^{n \times n}$. We now discuss the case when
$C(p_1, \ldots, p_k) \in \mathbb{D}^{n \times n}$. Assume that $\mathbb{D}$ is $\mathbb{D}_U$. Let $\delta_k$ be the g.c.d. of the $k \times k$
minors of $xI - A$. So $\delta_k$ divides the minor $p(x) = \det(xI - A)[\alpha, \alpha]$, $\alpha =
\{1, \ldots, k\}$. Clearly $p(x)$ is a normalized polynomial in $\mathbb{D}_U[x]$. Recall that
$\mathbb{D}_U[x]$ is also $\mathbb{D}_U$ (§1.4).

According to Theorem 1.4.8 the decomposition of $p(x)$ into irreducible
factors in $\mathbb{D}_U[x]$ is of the form (1.4.4), where $a = 1$ and each irreducible factor
is a nontrivial normalized polynomial in $\mathbb{D}_U[x]$. Hence $i_k = \frac{\delta_k}{\delta_{k-1}}$ is either the identity
or a nontrivial polynomial in $\mathbb{D}_U[x]$. Thus

Theorem 2.3.6 Let $A \in \mathbb{D}_U^{n \times n}$. Then the rational canonical form
$C(p_1, \ldots, p_k)$ of A over the quotient field $\mathbb{F}$ of $\mathbb{D}_U$ belongs to $\mathbb{D}_U^{n \times n}$.

Corollary 2.3.7 Let $A \in \mathbb{C}[x_1, \ldots, x_m]^{n \times n}$. Then the rational canonical
form of A over $\mathbb{C}(x_1, \ldots, x_m)$ belongs to $\mathbb{C}[x_1, \ldots, x_m]^{n \times n}$.

Using the results of Theorem 1.4.9 we deduce that Theorem 2.3.6 applies
to the ring of analytic functions in several variables although this ring is
not $\mathbb{D}_U$ (§1.3).

Theorem 2.3.8 Let $A \in H(\Omega)^{n \times n}$ ($\Omega \subset \mathbb{C}^m$). Then the rational
canonical form of A over the field of meromorphic functions in $\Omega$ belongs
to $H(\Omega)^{n \times n}$.
Problems 

1. Let $p(x) \in \mathbb{D}_U[x]$ be a normalized nontrivial polynomial. Assume that
$p(x) = p_1(x)p_2(x)$, where $p_i(x)$ is a normalized nontrivial polynomial
in $\mathbb{D}_U[x]$ for $i = 1, 2$. Using Problem 1.13.1 and 2 show that $xI -
C(p_1, p_2)$ given by (2.3.3) has the same invariant polynomials as
$xI - C(p)$ if and only if $(p_1, p_2) = 1$.






2. Let $A \in \mathbb{D}_U^{n \times n}$ and assume that $p_1(x), \ldots, p_k(x)$ are the nontrivial
normalized invariant polynomials of $xI - A$. Let

(2.3.4) $p_j(x) = (\phi_1(x))^{m_{1j}} \cdots (\phi_l(x))^{m_{lj}}, \quad j = 1, \ldots, k,$

where $\phi_1(x), \ldots, \phi_l(x)$ are nontrivial normalized irreducible polynomials
in $\mathbb{D}_U[x]$ such that $(\phi_i, \phi_j) = 1$ for $i \ne j$. Prove that

(2.3.5) $m_{ik} \ge 1, \quad m_{ik} \ge m_{i(k-1)} \ge \ldots \ge m_{i1} \ge 0, \quad \sum_{i,j=1}^{l,k} m_{ij} \deg \phi_i = n.$

The polynomials $\phi_i^{m_{ij}}$ for $m_{ij} > 0$ are called the elementary divisors
of $xI - A$. Let

(2.3.6) $E = \oplus_{m_{ij} > 0} C(\phi_i^{m_{ij}}).$

Show that $xI - A$ and $xI - E$ have the same invariant polynomials.
Hence $A \approx E$ over the quotient field $\mathbb{F}$ of $\mathbb{D}_U$. (In some references E
is called the rational canonical form of A.)



2.4 Splitting to invariant subspaces 

Let V be an $m$-dimensional vector space over $\mathbb{F}$ and let $T \in$ Hom (V). In
§2.2 we showed that the set of all matrices $\mathcal{A} \subset \mathbb{F}^{m \times m}$ which represent
$T$ in different bases is an equivalence class of matrices with respect to the
similarity relation. Theorem 2.2.4 shows that the class $\mathcal{A}$ is characterized
by the invariant polynomials of $xI - A$ for some $A \in \mathcal{A}$. Since $xI - A$ and
$xI - B$ have the same invariant polynomials if and only if $A \approx B$ we define:

Definition 2.4.1 Let $T \in$ Hom (V) and let $A \in \mathbb{F}^{m \times m}$ be a
representation matrix of $T$ given by the equality (2.2.2) in some basis $u_1, \ldots, u_m$ of
V. Then the invariant polynomials $i_1(x), \ldots, i_m(x)$ of $T$ are defined as the
invariant polynomials of $xI - A$. The characteristic polynomial of $T$ is the
polynomial $\det(xI - A)$.

The fact that the characteristic polynomial of $T$ is independent of a
representation matrix A follows from the identity (1.12.12)

(2.4.1) $\det(xI - A) = p_1(x) \cdots p_k(x),$

where $p_1(x), \ldots, p_k(x)$ are the nontrivial invariant polynomials of $xI - A$. In
§2.3 we showed that the matrix $C(p_1, \ldots, p_k)$ is a representation matrix of
$T$. In this section we consider another representation of $T$ which is closely
related to the matrix E given in (2.3.6). This form is achieved by splitting
V to a direct sum

(2.4.2) $V = \oplus_{i=1}^{l} U_i,$

where each $U_j$ is an invariant subspace of $T$ defined as follows:

Definition 2.4.2 Let V be a finite dimensional vector space over $\mathbb{F}$
and $T \in$ Hom (V). A subspace $U \subseteq V$ is an invariant subspace of $T$
($T$-invariant) if

(2.4.3) $TU \subseteq U.$

U is called trivial if $U = \{0\}$ or $U = V$. U is called nontrivial (proper)
if $\{0\} \ne U \ne V$. U is called irreducible if U cannot be expressed as a direct
sum of two nontrivial invariant subspaces of $T$. The restriction of $T$ to a
$T$-invariant subspace U is denoted by $T|U$.

Thus if V splits into a direct sum of nontrivial invariant subspaces of
$T$, then a direct sum of matrix representations of $T$ on each $U_j$ gives a
representation of $T$. So, a simple representation of $T$ can be achieved by
splitting V into a direct sum of irreducible invariant subspaces. To do so
we need to introduce the notion of the minimal polynomial of $T$. Consider
the linear operators $I = T^0, T, T^2, \ldots, T^{m^2}$, where $I$ is the identity operator
($Iv = v$). As dim Hom (V) $= m^2$, these $m^2 + 1$ operators are linearly
dependent. So there exists an integer $q \in [0, m^2]$ such that $I, T, \ldots, T^{q-1}$
are linearly independent and $I, T, \ldots, T^q$ are linearly dependent. Let $0 \in$
Hom (V) be the zero operator: $0v = 0$. For $\phi \in \mathbb{F}[x]$ let $\phi(T)$ be the
operator

$\phi(T) = \sum_{i=0}^{l} c_iT^i, \quad \phi(x) = \sum_{i=0}^{l} c_ix^i.$

$\phi$ is annihilated by $T$ if $\phi(T) = 0$.

Definition 2.4.3 A polynomial $\psi(x) \in \mathbb{F}[x]$ is a minimal polynomial
of $T \in$ Hom (V) if $\psi(x)$ is a normalized polynomial of the smallest degree
annihilated by $T$.

Lemma 2.4.4 Let $\psi(x) \in \mathbb{F}[x]$ be the minimal polynomial of $T \in$ Hom (V).
Assume that $T$ annihilates $\phi$. Then $\psi \,|\, \phi$.

Proof. Divide $\phi$ by $\psi$:

$\phi(x) = \chi(x)\psi(x) + \rho(x), \quad \deg \rho < \deg \psi.$

As $\phi(T) = \psi(T) = 0$ it follows that $\rho(T) = 0$. From the definition of $\psi(x)$
it follows that $\rho(x) = 0$. □
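The construction preceding Definition 2.4.3 (find the first power $T^q$ that depends linearly on $I, T, \ldots, T^{q-1}$) can be turned into a small computation. The sketch below is one possible way to do this over the rationals, using exact arithmetic from Python's standard `fractions` module; the function names `solve_combination` and `minimal_polynomial` are hypothetical helpers, and the approach is meant as an illustration rather than an efficient algorithm.

```python
from fractions import Fraction


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def solve_combination(vectors, target):
    """Return c with sum_i c[i] * vectors[i] == target, or None if impossible."""
    n, k = len(target), len(vectors)
    rows = [[Fraction(vectors[j][i]) for j in range(k)] + [Fraction(target[i])]
            for i in range(n)]
    pivots, r = [], 0
    for col in range(k):
        piv = next((i for i in range(r, n) if rows[i][col] != 0), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        pv = rows[r][col]
        rows[r] = [x / pv for x in rows[r]]
        for i in range(n):
            if i != r and rows[i][col] != 0:
                f = rows[i][col]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        pivots.append(col)
        r += 1
    # A zero row with a nonzero right-hand side means no solution exists.
    if any(all(x == 0 for x in row[:-1]) and row[-1] != 0 for row in rows):
        return None
    c = [Fraction(0)] * k
    for i, col in enumerate(pivots):
        c[col] = rows[i][-1]
    return c


def minimal_polynomial(T):
    """Coefficients [c_0, ..., c_{q-1}, 1] of psi(x) = x^q + ... + c_0."""
    n = len(T)
    P = [[int(i == j) for j in range(n)] for i in range(n)]
    powers = [[x for row in P for x in row]]       # flattened I
    while True:
        P = matmul(P, T)
        flat = [x for row in P for x in row]
        c = solve_combination(powers, flat)        # T^q = sum c_i T^i ?
        if c is not None:
            return [-ci for ci in c] + [1]
        powers.append(flat)
```

For example, `minimal_polynomial([[2, 0, 0], [0, 2, 0], [0, 0, 3]])` gives the coefficients of $x^2 - 5x + 6$, which has smaller degree than the characteristic polynomial, while `minimal_polynomial([[2, 1], [0, 2]])` gives those of $(x-2)^2$.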

Since $\mathbb{F}[x]$ is a unique factorization domain, let

(2.4.4) $\psi(x) = \prod_{i=1}^{l} \phi_i^{s_i}, \quad (\phi_i, \phi_j) = 1$ for $1 \le i < j \le l, \quad \deg \phi_i \ge 1, \ i = 1, \ldots, l,$

where each $\phi_i$ is a normalized irreducible polynomial in $\mathbb{F}[x]$.

Theorem 2.4.5 Let $\psi$ be the minimal polynomial of $T \in$ Hom (V).
Assume that $\psi$ splits to a product of coprime factors given in (2.4.4). Then
the vector space V splits to a direct sum (2.4.2), where each $U_j$ is a
nontrivial invariant subspace of $T$. Moreover $\phi_j^{s_j}$ is the minimal polynomial
of $T|U_j$.

The proof of the theorem follows immediately from the lemma below.

Lemma 2.4.6 Let $\psi$ be the minimal polynomial of $T \in$ Hom (V).
Assume that $\psi$ splits to a product of two nontrivial coprime factors

(2.4.5) $\psi(x) = \psi_1(x)\psi_2(x), \quad \deg \psi_i \ge 1, \ i = 1, 2, \quad (\psi_1, \psi_2) = 1,$

where each $\psi_i$ is normalized. Then

(2.4.6) $V = U_1 \oplus U_2,$

where each $U_j$ is a nontrivial $T$-invariant subspace and $\psi_j$ is the minimal
polynomial of $T_j := T|U_j$.

Proof. The assumptions of the lemma imply the existence of
polynomials $\theta_1(x)$ and $\theta_2(x)$ such that

(2.4.7) $\theta_1(x)\psi_1(x) + \theta_2(x)\psi_2(x) = 1.$

Define

(2.4.8) $U_j = \{u \in V : \ \psi_j(T)u = 0\}, \quad j = 1, 2.$

Since any two polynomials in $T$ commute (i.e. $\mu(T)\nu(T) = \nu(T)\mu(T)$) it
follows that each $U_j$ is $T$-invariant. The equality (2.4.7) implies

$I = \psi_1(T)\theta_1(T) + \psi_2(T)\theta_2(T).$

Hence for any $u \in V$ we have

$u = u_1 + u_2, \quad u_1 = \psi_2(T)\theta_2(T)u \in U_1, \quad u_2 = \psi_1(T)\theta_1(T)u \in U_2.$

So $V = U_1 + U_2$. Suppose that $u \in U_1 \cap U_2$. Then $\psi_1(T)u = \psi_2(T)u = 0$.
Hence $\theta_1(T)\psi_1(T)u = \theta_2(T)\psi_2(T)u = 0$. Thus

$u = \theta_1(T)\psi_1(T)u + \theta_2(T)\psi_2(T)u = 0.$

So $U_1 \cap U_2 = \{0\}$ and (2.4.6) holds. Clearly $T_j$ annihilates $\psi_j$. Let $\tilde{\psi}_j$ be
the minimal polynomial of $T_j$. So $\tilde{\psi}_j \,|\, \psi_j$, $j = 1, 2$. Now

$\tilde{\psi}_1(T)\tilde{\psi}_2(T)u = \tilde{\psi}_1(T)\tilde{\psi}_2(T)(u_1 + u_2) = \tilde{\psi}_2(T)\tilde{\psi}_1(T)u_1 + \tilde{\psi}_1(T)\tilde{\psi}_2(T)u_2 = 0.$

Hence $T$ annihilates $\tilde{\psi}_1\tilde{\psi}_2$. Since $\psi$ is the minimal polynomial of $T$ we have
$\psi_1\psi_2 \,|\, \tilde{\psi}_1\tilde{\psi}_2$. Therefore $\tilde{\psi}_j = \psi_j$, $j = 1, 2$. As $\deg \psi_j \ge 1$ it follows that
$\dim U_j \ge 1$. □
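The proof is constructive: the operators $\psi_2(T)\theta_2(T)$ and $\psi_1(T)\theta_1(T)$ are the projections onto $U_1$ and $U_2$. The following sketch checks this on one hand-picked example. The matrix $T$ and the polynomials $\theta_1 = 1$, $\theta_2 = -x$ are assumptions chosen for illustration (here $\psi_1 = (x-1)^2$, $\psi_2 = x-2$, and $\theta_1\psi_1 + \theta_2\psi_2 = 1$); the helper functions are not from the text.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def scale(c, A):
    return [[c * x for x in row] for row in A]


def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]


# T has minimal polynomial psi(x) = (x - 1)^2 (x - 2).
T = [[1, 1, 0],
     [0, 1, 0],
     [0, 0, 2]]
I = identity(3)

# psi_1 = (x - 1)^2, psi_2 = x - 2, theta_1 = 1, theta_2 = -x, so that
# theta_1*psi_1 + theta_2*psi_2 = (x^2 - 2x + 1) - (x^2 - 2x) = 1, as in (2.4.7).
T_minus_I = add(T, scale(-1, I))
psi1_T = matmul(T_minus_I, T_minus_I)
psi2_T = add(T, scale(-2, I))
theta1_T = I
theta2_T = scale(-1, T)

P1 = matmul(psi2_T, theta2_T)   # u -> u_1, with image in U_1 = ker psi_1(T)
P2 = matmul(psi1_T, theta1_T)   # u -> u_2, with image in U_2 = ker psi_2(T)
```

In this example `P1` and `P2` come out as the diagonal matrices diag(1, 1, 0) and diag(0, 0, 1), so $U_1$ is spanned by the first two standard basis vectors and $U_2$ by the third.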



Problems 

1. Assume that (2.4.6) holds, where $TU_j \subseteq U_j$, $j = 1, 2$. Let $\psi_j$ be
the minimal polynomial of $T_j = T|U_j$ for $j = 1, 2$. Show that the
minimal polynomial $\psi$ of $T$ is equal to $\frac{\psi_1\psi_2}{(\psi_1, \psi_2)}$.

2. Let the assumptions of Problem 1 hold. Assume furthermore that
$\psi = \phi^s$, where $\phi$ is irreducible over $\mathbb{F}[x]$. Then either $\psi_1 = \psi$ or
$\psi_2 = \psi$.
3. Let $C(p) \in \mathbb{D}^{m \times m}$ be the companion matrix given by (2.3.1). Let
$e_i = (\delta_{i1}, \ldots, \delta_{im})^T$, $i = 1, \ldots, m$ be the standard basis in $\mathbb{D}^m$. Show

(2.4.9) $C(p)e_i = e_{i-1} - a_{m-i+1}e_m, \quad i = 1, \ldots, m, \quad (e_0 = 0).$

Prove that $p(C) = 0$ and that any polynomial $0 \ne q \in \mathbb{D}[x]$, $\deg q <
m$ is not annihilated by C. (Consider $q(C)e_i$ and use (2.4.9).) That
is: $p$ is the minimal polynomial of $C(p)$.

Hint: Use induction on $m$ as follows. Set $f_i = e_{m-i+1}$ for
$i = 1, \ldots, m$. Let $q = x^{m-1} + a_1x^{m-2} + \ldots + a_{m-1}$. Set
$Q = \begin{bmatrix} 0 & 0 \\ 0_{m-1} & C(q) \end{bmatrix}$.
Use the induction hypothesis on $C(q)$, i.e. $q(C(q)) = 0$, and the facts
that $C(p)f_i = Qf_i$ for $i = 1, \ldots, m-1$, $C(p)f_m = f_{m+1} + Qf_m$
to obtain that $p(C(p))f_1 = 0$. Now use (2.4.9) to show
that $p(C(p))f_i = 0$ for $i = 2, \ldots, m+1$.

4. Let $A \in \mathbb{F}^{m \times m}$. Using Theorem 2.3.4 and Problems 1 and 3 show
that the minimal polynomial $\psi$ of A is the last invariant polynomial
of $xI - A$. That is:

(2.4.10) $\psi(x) = \frac{\det(xI - A)}{\delta_{m-1}(x)},$

where $\delta_{m-1}(x)$ is the g.c.d. of all $(m-1) \times (m-1)$ minors of $xI - A$.

5. Show that the results of Problem 4 apply to $A \in \mathbb{D}_U^{m \times m}$. In particular,
if $A \approx B$ then A and B have the same minimal polynomials.

6. Deduce from Problem 4 the Cayley-Hamilton theorem which states
that $T \in$ Hom (V) annihilates its characteristic polynomial.

7. Let $A \in \mathbb{D}^{m \times m}$. Prove that A annihilates its characteristic
polynomial. (Consider the quotient field $\mathbb{F}$ of $\mathbb{D}$.)

8. Use Problem 6 and Lemma 2.4.4 to show

(2.4.11) $\deg \psi \le \dim V.$

9. Let $\psi = \phi^s$, where $\phi$ is irreducible in $\mathbb{F}[x]$. Assume that $\deg \psi =
\dim V$. Use Problems 2 and 8 to show that V is an irreducible
invariant subspace of $T$.

10. Let $p(x) \in \mathbb{F}[x]$ be a nontrivial normalized polynomial such that $p =
\phi^s$, where $\phi$ is a normalized irreducible polynomial in $\mathbb{F}[x]$. Let $T \in$ Hom (V) be
represented by $C(p)$. Use Problem 9 to show that V is an irreducible
invariant subspace of $T$.

11. Let $T \in$ Hom (V) and let E be the matrix given by (2.3.6), which is
determined by the elementary divisors of $T$. Using Problem 10 show
that the representation E of $T$ corresponds to a splitting of V to a
direct sum of irreducible invariant subspaces of $T$.

12. Deduce from Problems 9 and 11 that V is an irreducible invariant
subspace of $T$ if and only if the minimal polynomial $\psi$ of $T$ satisfies
the assumptions of Problem 9.



2.5 An upper triangular form 

Definition 2.5.1 Let M be a $\mathbb{D}$-module and assume that $T \in$ Hom (M).
$\lambda \in \mathbb{D}$ is an eigenvalue of $T$ if there exists $0 \ne u \in M$ such that

(2.5.1) $Tu = \lambda u.$

The element (vector) $u$ is an eigenelement (eigenvector) corresponding
to $\lambda$. An element $0 \ne u$ is a generalized eigenelement (eigenvector) if

(2.5.2) $(\lambda I - T)^k u = 0$

for some positive integer $k$, where $\lambda$ is an eigenvalue of $T$. For $T \in \mathbb{D}^{m \times m}$,
$\lambda$ is an eigenvalue if (2.5.1) holds for some $0 \ne u \in \mathbb{D}^m$. The element $u$ is an
eigenelement (eigenvector) or a generalized eigenelement (eigenvector) if
either (2.5.1) or (2.5.2) holds respectively.

Lemma 2.5.2 Let $T \in \mathbb{D}^{m \times m}$. Then $\lambda$ is an eigenvalue of $T$ if and
only if $\lambda$ is a root of the characteristic polynomial $\det(xI - T)$.

Proof. Let $\mathbb{F}$ be the quotient field of $\mathbb{D}$. Assume first that $\lambda$ is an
eigenvalue of $T$. As (2.5.1) is equivalent to $(\lambda I - T)u = 0$ and $u \ne 0$, the
above system has a nontrivial solution. Therefore $\det(\lambda I - T) = 0$. Vice
versa, if $\det(\lambda I - T) = 0$ then the system $(\lambda I - T)v = 0$ has a nontrivial
solution $v \in \mathbb{F}^m$. Then there exists $0 \ne a \in \mathbb{D}$ such that $u := av \in \mathbb{D}^m$ and
$Tu = \lambda u$. □



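Definition 2.5.3 A matrix $A = [a_{ij}] \in \mathbb{D}^{m \times m}$ is upper (lower)
triangular if $a_{ij} = 0$ for $j < i$ ($j > i$). Let UT$(m, \mathbb{D})$, (LT$(m, \mathbb{D})$) $\subset \mathbb{D}^{m \times m}$
be the ring of upper (lower) triangular $m \times m$ matrices. Let UTG$(m, \mathbb{D})$ =
UT$(m, \mathbb{D}) \cap GL(m, \mathbb{D})$, (LTG$(m, \mathbb{D})$ = LT$(m, \mathbb{D}) \cap GL(m, \mathbb{D})$).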

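Theorem 2.5.4 Let $T \in \mathbb{D}^{m \times m}$. Assume that the characteristic
polynomial of $T$ splits to linear factors over $\mathbb{D}$:

(2.5.3) $\det(xI - T) = \prod_{i=1}^{m} (x - \lambda_i), \quad \lambda_i \in \mathbb{D}, \ i = 1, \ldots, m.$

Assume furthermore that $\mathbb{D}$ is a Bezout domain. Then

(2.5.4) $T = QAQ^{-1}, \quad Q \in GL(m, \mathbb{D}), \quad A = [a_{ij}]_1^m \in$ UT$(m, \mathbb{D}),$

such that $a_{11}, \ldots, a_{mm}$ are the eigenvalues $\lambda_1, \ldots, \lambda_m$ appearing in any
specified order.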

Proof. Let $\lambda$ be an eigenvalue of $T$ and consider the set of all $u \in \mathbb{D}^m$
which satisfy (2.5.1). Clearly this set is a $\mathbb{D}$-module M. Lemma 2.5.2
yields that M contains nonzero vectors. Since $\mathbb{D}$ is a Bezout domain,
Theorem 1.12.3 implies that M has a basis $u_1, \ldots, u_k$ which can be completed to a
basis $u_1, \ldots, u_m$ in $\mathbb{D}^m$. Let

(2.5.5) $Tu_i = \sum_{j=1}^{m} b_{ji}u_j, \quad i = 1, \ldots, m, \quad B = [b_{ij}] \in \mathbb{D}^{m \times m}.$

A straightforward computation shows that $T \approx B$. As $Tu_i = \lambda u_i$, $i =
1, \ldots, k$, we have that $b_{j1} = 0$ for $j > 1$. So

$\det(xI - T) = \det(xI - B) = (x - \lambda)\det(xI - \tilde{B}),$

where $\tilde{B} = [b_{ij}]_{i,j=2}^{m} \in \mathbb{D}^{(m-1) \times (m-1)}$. Here the last equality is achieved by
expanding $\det(xI - B)$ by the first column. Use the induction hypothesis
to obtain that $\tilde{B} \approx A_1$, where $A_1 \in$ UT$(m-1, \mathbb{D})$, with the eigenvalues of
$\tilde{B}$ on the main diagonal of $A_1$ appearing in any prescribed order. Hence
$T \approx C = [c_{ij}]_1^m$, where $C \in$ UT$(m, \mathbb{D})$ with $c_{11} = \lambda$, $[c_{ij}]_{i,j=2}^{m} = A_1$. □

The upper triangular form of A is not unique unless A is a scalar matrix:
$A = aI$. See Problem 1.

Definition 2.5.5 Let $T \in \mathbb{D}^{m \times m}$ and assume that (2.5.3) holds. Then
the eigenvalue multiset of $T$ is the set $S(T) = \{\lambda_1, \ldots, \lambda_m\}$. The multiplicity
of $\lambda \in S(T)$, denoted by $m(\lambda)$, is the number of elements in $S(T)$ which are
equal to $\lambda$. $\lambda$ is called a simple eigenvalue if $m(\lambda) = 1$. The spectrum of $T$,
denoted by spec $(T)$, is the set of all distinct eigenvalues of $T$:

(2.5.6) $\sum_{\lambda \in \operatorname{spec}(T)} m(\lambda) = m.$

For $T \in \mathbb{C}^{m \times m}$ arrange the eigenvalues of $T$ in the decreasing order of their
absolute values (unless otherwise stated):

(2.5.7) $|\lambda_1| \ge \cdots \ge |\lambda_m| \ge 0.$

The spectral radius of $T$, denoted by $\rho(T)$, is equal to $|\lambda_1|$.

Problems 

1. Let Q correspond to the elementary row operation described in
Definition 1.11.6 (iii). Assume that $A \in$ UT$(m, \mathbb{D})$. Show that if $j < i$
then $QAQ^{-1} \in$ UT$(m, \mathbb{D})$ with the same diagonal as A. More
generally, for any $Q \in$ UTG$(m, \mathbb{D})$, $QAQ^{-1} \in$ UT$(m, \mathbb{D})$ with the same
diagonal as A.

2. Show that if $T \in \mathbb{D}^{m \times m}$ is similar to $A \in$ UT$(m, \mathbb{D})$ then the
characteristic polynomial of $T$ splits to linear factors over $\mathbb{D}[x]$.

3. Let $T = [t_{ij}] \in \mathbb{D}^{m \times m}$ and put

(2.5.8) $\det(xI - T) = x^m + \sum_{j=0}^{m-1} a_{m-j}x^j.$

Assume that the assumptions of Theorem 2.5.4 hold. Show that

(2.5.9) $(-1)^k a_k = \sum_{\alpha \in [m]_k} \det T[\alpha, \alpha] = s_k(\lambda_1, \ldots, \lambda_m), \quad k = 1, \ldots, m.$

Here $s_k(x_1, \ldots, x_m)$ is the $k$-th elementary symmetric polynomial of
$x_1, \ldots, x_m$. The coefficient $-a_1$ is called the trace of $T$:

(2.5.10) $\operatorname{tr} T = \sum_{i=1}^{m} t_{ii} = \sum_{i=1}^{m} \lambda_i.$

4. Let $T \in \mathbb{D}^{m \times m}$ and assume the assumptions of Theorem 2.5.4.
Suppose furthermore that $\mathbb{D}$ is $\mathbb{D}_U$. Using the results of Theorem 2.5.4
and Problem 2.4.5 show that the minimal polynomial $\psi(x)$ of $T$ is of
the form

(2.5.11) $\psi(x) = \prod_{i=1}^{l} (x - \alpha_i)^{s_i}, \quad \alpha_i \ne \alpha_j$ for $i \ne j, \quad 1 \le s_i \le m_i := m(\alpha_i), \ i = 1, \ldots, l,$

where spec $(T) = \{\alpha_1, \ldots, \alpha_l\}$. (Hint: Consider the diagonal elements
of $\psi(A)$.)

5. Let $T \in \mathbb{D}_U^{m \times m}$ and assume that the minimal polynomial of $T$ is given
by (2.5.11). Using Problem 2.4.4 and the equality (2.4.1) show

(2.5.12) $\det(xI - T) = \prod_{i=1}^{l} (x - \alpha_i)^{m_i}.$

2.6 Jordan canonical form 

Theorem 2.5.4 and Problem 2.5.2 show that $T \in \mathbb{D}^{m \times m}$ is similar to an
upper triangular matrix if and only if the characteristic polynomial of $T$
splits to linear factors. Unfortunately, the upper triangular form of $T$ is not
unique. If $\mathbb{D}$ is a field then there is a special upper triangular form in the
similarity class of $T$ which is essentially unique. For convenience we state
the theorem for an operator $T \in$ Hom (V).

Theorem 2.6.1 Let V be a vector space over the field $\mathbb{F}$. Let $T \in$
Hom (V) and assume that the minimal polynomial $\psi(x)$ of $T$ splits to a
product of linear factors as given by (2.5.11). Then V splits to a direct sum
of nontrivial irreducible invariant subspaces of $T$

(2.6.1) $V = W_1 \oplus \ldots \oplus W_q.$

In each invariant subspace $W$ ($= W_j$) it is possible to choose a basis
consisting of generalized eigenvectors $x_1, \ldots, x_r$ such that

(2.6.2) $Tx_1 = \lambda_0 x_1, \quad Tx_{k+1} = \lambda_0 x_{k+1} + x_k, \quad k = 1, \ldots, r-1,$

where $\lambda_0$ is equal to some $\alpha_i$ and $r \le s_i$. (For $r = 1$ the second part of
(2.6.2) is void.) Moreover for each $\alpha_i$ there exists an invariant subspace W
whose basis satisfies (2.6.2) with $\lambda_0 = \alpha_i$ and $r = s_i$.

Proof. Assume first that the minimal polynomial of $T$ is

(2.6.3) $\psi(x) = x^s.$

Recall that $\psi(x)$ is the last invariant polynomial of $T$. Hence each nontrivial
invariant polynomial of $T$ is of the form $x^r$ for $1 \le r \le s$. Theorem 2.3.4
implies that V has a basis in which $T$ is presented by its rational canonical
form

$C(x^{r_1}) \oplus \ldots \oplus C(x^{r_k}), \quad 1 \le r_1 \le r_2 \le \ldots \le r_k = s.$

Hence V splits to a direct sum of $T$-invariant subspaces (2.6.1). Let W be
an invariant subspace in the decomposition (2.6.1). Then $\tilde{T} := T|W$ has the
minimal polynomial $x^r$, $1 \le r \le s$. Furthermore, W has a basis $x_1, \ldots, x_r$
so that $\tilde{T}$ is represented in this basis by the companion matrix $C(x^r)$. It
is straightforward to show that $x_1, \ldots, x_r$ satisfies (2.6.2) with $\lambda_0 = 0$. As
W is spanned by $x_r, \tilde{T}x_r, \ldots, \tilde{T}^{r-1}x_r$ it follows that W is an irreducible
invariant subspace of $T$. Assume now that the minimal polynomial of $T$
is $(x - \lambda_0)^s$. Let $T_0 = T - \lambda_0 I$. Clearly $x^s$ is the minimal polynomial of
$T_0$. Let (2.6.1) be the decomposition of V to invariant subspaces of $T_0$ as
above. In each invariant subspace W choose a basis for $T_0$ as above. Then
our theorem holds in this case too.

Assume now that the minimal polynomial of $T$ is given by (2.5.11). Use
Theorem 2.4.5 and the above arguments to deduce the theorem. □



Let

(2.6.4) $H_n := C(x^n).$

Sometimes we denote $H_n$ by $H$ when the dimension of $H$ is well defined.

Let W = span $(x_1, x_2, \ldots, x_r)$. Let $T \in$ Hom (W) be given by (2.6.2).
Then $T$ is presented in the basis $x_1, \ldots, x_r$ by the Jordan block $\lambda_0 I_r + H_r$.
Theorem 2.6.1 yields:

Theorem 2.6.2 Let $A \in \mathbb{F}^{n \times n}$. Assume that the minimal polynomial
$\psi(x)$ of A splits to linear factors as in (2.5.11). Then there exists $P \in
GL(n, \mathbb{F})$ such that

(2.6.5) $P^{-1}AP = J, \quad J = \oplus_{i=1}^{l} \oplus_{j=1}^{q_i} (\alpha_i I_{m_{ij}} + H_{m_{ij}}),$

(2.6.6) $1 \le m_{iq_i} \le m_{i(q_i-1)} \le \ldots \le m_{i1} = s_i, \quad i = 1, \ldots, l.$

Definition 2.6.3 Let $A \in \mathbb{F}^{n \times n}$ satisfy the assumptions of Theorem
2.6.2. The matrix $J$ in (2.6.5) is called the Jordan canonical form of A.
Let $T \in$ Hom (V) and assume that its minimal polynomial splits over $\mathbb{F}$.
Then a representation matrix $J$ (2.6.5) is called the Jordan canonical form
of $T$.
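To make the chain relations (2.6.2) concrete, the short sketch below builds a single Jordan block $\lambda_0 I_r + H_r$ and checks that the standard basis vectors play the role of $x_1, \ldots, x_r$. The values $\lambda_0 = 5$, $r = 3$ and the function names are arbitrary illustrative choices.

```python
def jordan_block(lam, r):
    """Return lam*I_r + H_r, where H_r = C(x^r) has ones on the superdiagonal."""
    return [[lam if i == j else (1 if j == i + 1 else 0) for j in range(r)]
            for i in range(r)]


def apply(M, v):
    """Matrix-vector product M v."""
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]


lam, r = 5, 3
J = jordan_block(lam, r)

# x[k] is the standard basis vector e_{k+1}, playing the role of x_{k+1} in (2.6.2).
x = [[int(i == k) for i in range(r)] for k in range(r)]

# First relation of (2.6.2): T x_1 = lam * x_1.
first_ok = apply(J, x[0]) == [lam * t for t in x[0]]

# Second relation of (2.6.2): T x_{k+1} = lam * x_{k+1} + x_k.
chain_ok = all(
    apply(J, x[k + 1]) == [lam * s + t for s, t in zip(x[k + 1], x[k])]
    for k in range(r - 1)
)
```

Both `first_ok` and `chain_ok` evaluate to `True` for this block, so $x_1$ is an eigenvector and $x_2, x_3$ are generalized eigenvectors.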

Remark 2.6.4 Let $A \in \mathbb{F}^{n \times n}$ and suppose that the minimal polynomial
$\psi$ of A does not split over $\mathbb{F}$. Then there exists a finite extension $\mathbb{K}$ of $\mathbb{F}$
such that $\psi$ splits over $\mathbb{K}$. Then (2.6.5) holds for some $P \in GL(n, \mathbb{K})$. $J$
is referred to as the Jordan canonical form of A.

Corollary 2.6.5 Let $A \in \mathbb{F}^{n \times n}$. Assume that the minimal polynomial
of A is given by (2.5.11). Let $J$ be the Jordan canonical form of A given by
(2.6.5). Set

(2.6.7) $m_{i(q_i+1)} = \ldots = m_{in} = 0, \quad i = 1, \ldots, l.$

Then the elementary polynomials of $xI - A$, which are the elementary
divisors of $xI - A$ defined in Problem 2.3.2, are

(2.6.8) $\phi_i^{m_{ij}} = (x - \alpha_i)^{m_{ij}}, \quad j = 1, \ldots, q_i, \ i = 1, \ldots, l.$

Hence the invariant polynomials $i_1(x), \ldots, i_n(x)$ of $xI - A$ are

(2.6.9) $i_r(x) = \prod_{i=1}^{l} (x - \alpha_i)^{m_{i(n-r+1)}}, \quad r = 1, \ldots, n.$

The above Corollary shows that the Jordan canonical form is unique up
to a permutation of Jordan blocks.

Problems 

1. Show directly that to each eigenvalue $\lambda_0$ of a companion matrix
$C(p) \in \mathbb{F}^{n \times n}$ corresponds a one-dimensional eigenvalue subspace spanned
by the vector $(1, \lambda_0, \lambda_0^2, \ldots, \lambda_0^{n-1})^T$.

2. Let $A \in \mathbb{F}^{n \times n}$ and assume that the minimal polynomial of A splits in
$\mathbb{F}$. Let $U_1, U_2 \subset \mathbb{F}^n$ be the subspaces of all generalized eigenvectors of
$A, A^T$ respectively corresponding to $\lambda \in$ spec $(A)$. Show that there
exist bases $x_1, \ldots, x_m$ and $y_1, \ldots, y_m$ in $U_1$ and $U_2$ respectively so
that

$y_i^T x_j = \delta_{ij}, \quad i, j = 1, \ldots, m.$

(Hint: Assume first that A is in its Jordan canonical form.)

3. Let $A \in \mathbb{F}^{n \times n}$. Let $\lambda, \mu \in \mathbb{F}$ be two distinct eigenvalues of A. Let
$x, y \in \mathbb{F}^n$ be two generalized eigenvectors of $A, A^T$ corresponding to
$\lambda, \mu$ respectively. Show that $y^T x = 0$.

4. Verify directly that $J$ (given in (2.6.5)) annihilates its characteristic
polynomial. Using the fact that any $A \in \mathbb{F}^{n \times n}$ is similar to its Jordan
canonical form over the finite extension field $\mathbb{K}$ of $\mathbb{F}$ deduce the
Cayley-Hamilton theorem.

5. Let $A, B \in \mathbb{F}^{n \times n}$. Show that $A \approx B$ if and only if A and B have the
same Jordan canonical form.

2.7 Some applications of Jordan canonical form 

Definition 2.7.1 Let $A \in \mathbb{F}^{n\times n}$ and assume that $\det(xI - A)$ splits in
$\mathbb{F}$. Let $\lambda_0$ be an eigenvalue of $A$. Then the number of factors of the form
$x - \lambda_0$ appearing in the minimal polynomial $\psi(x)$ of $A$ is called the index of
$\lambda_0$ and is denoted by index $\lambda_0$. The dimension of the eigenspace
of $A$ corresponding to $\lambda_0$ is called the geometric multiplicity of $\lambda_0$.

Using the results of the previous section we obtain:

Lemma 2.7.2 Let the assumptions of Definition 2.7.1 hold. Then index $\lambda_0$
is the size of the largest Jordan block corresponding to $\lambda_0$, and the geometric
multiplicity of $\lambda_0$ is the number of the Jordan blocks corresponding to $\lambda_0$.

Let $T \in \mathrm{Hom}(V)$, $\lambda_0 \in \mathrm{spec}(T)$ and consider the invariant subspaces

(2.7.1) $X_r = \{x \in V : (\lambda_0 I - T)^r x = 0\}, \quad r = 0,1,\dots,$
$Y_r = (\lambda_0 I - T)^r V, \quad r = 0,1,\dots.$

Theorem 2.7.3 Let $T \in \mathrm{Hom}(V)$ and assume that $\lambda_0$ is an eigenvalue
of $T$. Let index $\lambda_0 = m_1 \ge m_2 \ge \dots \ge m_p \ge 1$ be the dimensions of
all Jordan blocks corresponding to $\lambda_0$ which appear in the Jordan canonical
form of $T$. Then

(2.7.2) $\dim X_r = \sum_{i=1}^{p} \min(r, m_i), \quad r = 0,1,\dots,$
$\dim Y_r = \dim V - \dim X_r, \quad r = 0,1,\dots.$

In particular

$\{0\} = X_0 \subsetneq X_1 \subsetneq X_2 \subsetneq \dots \subsetneq X_m,$
$X(\lambda_0) := X_m = X_{m+1} = \dots, \quad m = \text{index } \lambda_0,$
(2.7.3) $V = Y_0 \supsetneq Y_1 \supsetneq Y_2 \supsetneq \dots \supsetneq Y_m,$
$Y(\lambda_0) := Y_m = Y_{m+1} = \dots,$
$V = X(\lambda_0) \oplus Y(\lambda_0).$

Let

(2.7.4) $\nu_i = \dim X_i - \dim X_{i-1}, \quad i = 1,\dots,m+1, \quad m := \text{index } \lambda_0.$

Then $\nu_i$ is the number of Jordan blocks of size at least $i$ corresponding to $\lambda_0$.
In particular

(2.7.5) $\nu_1 \ge \nu_2 \ge \dots \ge \nu_m > \nu_{m+1} = 0.$
Furthermore

(2.7.6) $\nu_i - \nu_{i+1}$ is the number of Jordan blocks of order $i$ in the
Jordan canonical form of $T$ corresponding to $\lambda_0$.

Proof. Assume first that $\det(xI - T) = \psi(x) = (x - \lambda_0)^m$. That is, $T$
has one Jordan block of order $m$ corresponding to $\lambda_0$. Then the theorem
follows straightforwardly. Observe next that for such $T$

$\mathrm{Ker}(\lambda I - T) = 0, \quad \mathrm{Range}(\lambda I - T) = V, \quad \lambda \ne \lambda_0.$

Assume now that $\det(xI - T)$ splits in $\mathbb{F}$ and $V$ has the decomposition
(2.6.1). Apply the above arguments to each $T|W_i$, for $i = 1,\dots,q$, to deduce
the theorem in this case. In the general case, where $\det(xI - T)$ does not
split to linear factors, use the rational canonical form of $T$ to deduce the
theorem. □

Thus (2.7.3) gives yet another characterization of the index $\lambda_0$. Note
that in view of Definition 2.5.1 each $0 \ne x \in X_k$ is a generalized eigenvector
of $T$. The sequence (2.7.4) is called the Weyr sequence corresponding to
$\lambda_0$.
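The Weyr sequence can be computed directly from the kernels of the powers of $\lambda_0 I - T$. The following is a minimal sketch, assuming $T$ is nilpotent (so $\lambda_0 = 0$) and is given as a direct sum of Jordan blocks with ones on the superdiagonal; the helper names `rank`, `jordan_nilpotent`, `matmul` and `weyr_sequence` are illustrative and not part of the text.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def jordan_nilpotent(sizes):
    """Direct sum of nilpotent Jordan blocks with ones on the superdiagonal."""
    n = sum(sizes)
    a = [[0] * n for _ in range(n)]
    start = 0
    for s in sizes:
        for i in range(s - 1):
            a[start + i][start + i + 1] = 1
        start += s
    return a


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def weyr_sequence(t):
    """nu_i = dim X_i - dim X_{i-1} with X_r = ker T^r, for a nilpotent T."""
    n = len(t)
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # T^0 = I
    dims = [0]                                                   # dim X_0 = 0
    for _ in range(n):
        power = matmul(power, t)
        dims.append(n - rank(power))                             # dim ker T^r
        if dims[-1] == n:                                        # reached r = index 0
            break
    return [dims[i] - dims[i - 1] for i in range(1, len(dims))]


sizes = [3, 2, 2, 1]
nu = weyr_sequence(jordan_nilpotent(sizes))
# (2.7.6): nu_i - nu_{i+1} counts the blocks of order exactly i.
blocks_of_size = [nu[i] - (nu[i + 1] if i + 1 < len(nu) else 0)
                  for i in range(len(nu))]
print(nu, blocks_of_size)
```

For the block sizes 3, 2, 2, 1 the sketch returns $\nu_1, \nu_2, \nu_3$ only; the trailing value $\nu_{m+1} = 0$ of (2.7.5) is omitted.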

Definition 2.7.4 A transformation $T \in \mathrm{Hom}(V)$ is diagonable if there
exists a basis in $V$ which consists entirely of eigenvectors of $T$. That is, any
representation matrix $A$ of $T$ is diagonable, i.e. $A$ is similar to a diagonal
matrix.

For such $T$ we have that $X_1 = X_{m_1}$ for each $\lambda_0 \in \mathrm{spec}(T)$. Theorem
2.6.1 yields:

Theorem 2.7.5 Let $T \in \mathrm{Hom}(V)$. Then $T$ is diagonable if and only
if the minimal polynomial $\psi$ of $T$ splits into pairwise distinct linear factors.
That is, the index of any eigenvalue of $T$ equals 1.

Definition 2.7.6 Let $M$ be a $\mathbb{D}$-module and let $T \in \mathrm{Hom}(M)$. $T$ is
nilpotent if $T^s = 0$ for some positive integer $s$.

Let $T \in \mathrm{Hom}(V)$ and assume that $\det(xI - T)$ splits in $\mathbb{F}$. For $\lambda_0 \in
\mathrm{spec}(T)$ let $X(\lambda_0) \subset V$ be the $T$-invariant subspace defined in (2.7.3).
Then the decomposition (2.6.1) yields the spectral decomposition of $V$:

(2.7.7) $V = \bigoplus_{\lambda \in \mathrm{spec}(T)} X(\lambda).$

The above decomposition is coarser than the fine decomposition (2.6.1).
The advantage of the spectral decomposition is that it is uniquely defined.
Note that each $X(\lambda)$, $\lambda \in \mathrm{spec}(T)$, is a direct sum of irreducible $T$-invariant
subspaces corresponding to the eigenvalue $\lambda$ in the decomposition (2.6.1).
Clearly $(T - \lambda I)|X(\lambda)$ is a nilpotent operator. In the following theorem we
address the problem of the choices of irreducible invariant subspaces in the
decomposition (2.6.1) for a nilpotent transformation $T$.

Theorem 2.7.7 Let $T \in \mathrm{Hom}(V)$ be nilpotent. Let index $0 = m =
m_1 \ge m_2 \ge \dots \ge m_p \ge 1$ be the dimensions of all Jordan blocks appearing
in the Jordan canonical form of $T$. Let (2.6.1) be a decomposition of $V$ to
a direct sum of irreducible $T$-invariant subspaces such that

(2.7.8) $\dim W_1 = m_1 \ge \dim W_2 = m_2 \ge \dots \ge \dim W_q = m_q \ge 1,$
$m_1 = \dots = m_{i_1} > m_{i_1+1} = \dots = m_{i_2} > \dots > m_{i_{p-1}+1} = \dots = m_{i_p} = m_q.$

Assume that each $W_i$ has a basis $y_{i,1},\dots,y_{i,m_i}$ satisfying (2.6.2), with $\lambda_0 =
0$. Let $X_i, Y_i$, $i = 0,\dots$ be defined as in (2.7.1) for $\lambda_0 = 0$. Then the above
bases in $W_1,\dots,W_q$ can be chosen recursively as follows:

(a) $y_{1,1},\dots,y_{i_1,1}$ is an arbitrary basis in $Y_{m-1}$.

(b) Let $1 \le k < m$. Assume that $y_{l,j}$ are given for all $l$ such that
$m_l \ge m - k + 1$ and all $j$ such that $1 \le j \le m_l - m + k$. Then each
$y_{l,(k+1)}$ is any element in $T^{-1} y_{l,k} \cap Y_{m-k-1}$, which is a coset of the
subspace $Y_{m-k-1} \cap X_1 = \mathrm{Ker}\, T|Y_{m-k-1}$. If $m - k = m_t$ for some
$1 < t \le i_p$ then $y_{i_{t-1}+1,1},\dots,y_{i_t,1}$ is any set of linearly independent vectors
in $Y_{m-k-1} \cap X_1$, which complements the above chosen vectors $y_{l,j}$, $m_l \ge
m - k + 1$, $m_l - m + k + 1 \ge j$ to a basis in $Y_{m-k-1}$.

See Problem 1 for the proof of the Theorem.

Corollary 2.7.8 Let the assumptions of Theorem 2.7.7 hold. Suppose
furthermore that $Z \subset V$ is an eigenspace of $T$. Then there exists a decomposition
(2.6.1) of $V$ to a direct sum of irreducible $T$-invariant subspaces such
that $Z$ has a basis consisting of $l = \dim Z$ eigenvectors of the restrictions
$T|W_{j_1},\dots,T|W_{j_l}$ for $1 \le j_1 < \dots < j_l \le q$.

Proof. Let $Z_1 := Z \cap Y_{m-1} \subset \dots \subset Z_m := Z \cap Y_0$ and denote
$l_i = \dim Z_i$ for $i = 1,\dots,m$. We then construct bases in $W_1,\dots,W_q$
as in Theorem 2.7.7 in the following way. If $l_1 = \dim Z_1 > 0$ we pick
$y_{1,1},\dots,y_{l_1,1}$ to be from $Z_1$. In general, for each $k = 1,\dots,m-1$, $1 < t \le i_p$
and $m_t = m - k$ such that $l_{k+1} > l_k$ we let $y_{i_{t-1}+1,1},\dots,y_{i_{t-1}+l_{k+1}-l_k,1}$ be
any set of linearly independent vectors in $Z_{k+1}$, which form a basis in
$Z_{k+1}/Z_k$. □

Problems 

1. Prove Theorem 2.7.7 

2.8 The matrix equation AX - XB = 0

Let $A, B \in \mathbb{D}^{n\times n}$. A possible way to determine if $A$ and $B$ are similar over
$GL(n,\mathbb{D})$ is to consider the matrix equation

(2.8.1) $AX - XB = 0.$

Then $A \approx B$ if and only if there exists a solution $X \in \mathbb{D}^{n\times n}$ such that
$\det X$ is an invertible element in $\mathbb{D}$. For $X \in \mathbb{D}^{m\times n}$ let $\hat{X} \in \mathbb{D}^{mn}$ be the
column vector composed of the $n$ columns of $X$:

(2.8.2) $\hat{X} = (x_{11},\dots,x_{m1}, x_{12},\dots,x_{m2},\dots,x_{m(n-1)}, x_{1n},\dots,x_{mn})^T,$

where $X = [x_{ij}] \in \mathbb{D}^{m\times n}$. Then the equation (2.8.1), where $A \in \mathbb{D}^{m\times m}$, $B \in
\mathbb{D}^{n\times n}$, has a simple form in tensor notation [MaM64]. (See also Problems
1 and 2.8.13.)

(2.8.3) $(I \otimes A - B^T \otimes I)\hat{X} = 0.$
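The passage from (2.8.1) to (2.8.3) can be checked numerically on a small example. The following sketch assumes the Kronecker product $A \otimes B = [a_{ij}B]$ and the column-stacking map of (2.8.2); the function names `kron`, `vec` and `sylvester_matrix` are illustrative choices.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]


def kron(a, b):
    """Kronecker product [a_ij * B]."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def vec(x):
    """Stack the columns of x into one list, as in (2.8.2)."""
    return [x[i][j] for j in range(len(x[0])) for i in range(len(x))]


def sylvester_matrix(a, b):
    """I_n (x) A - B^T (x) I_m for A of size m x m and B of size n x n."""
    m, n = len(a), len(b)
    left = kron(identity(n), a)
    right = kron(transpose(b), identity(m))
    return [[p - q for p, q in zip(r1, r2)] for r1, r2 in zip(left, right)]


A = [[1, 2], [3, 4]]
B = [[0, 1, 0], [2, 0, 1], [1, 1, 1]]
X = [[1, 0, 2], [-1, 3, 1]]

AX, XB = matmul(A, X), matmul(X, B)
lhs = vec([[p - q for p, q in zip(r1, r2)] for r1, r2 in zip(AX, XB)])
M = sylvester_matrix(A, B)
x_hat = vec(X)
rhs = [sum(M[i][j] * x_hat[j] for j in range(len(x_hat))) for i in range(len(M))]
print(lhs == rhs)  # the vector of AX - XB agrees with (I (x) A - B^T (x) I) applied to vec(X)
```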

Assume that $\mathbb{D}$ is a Bezout domain. Then the set of all $X \in \mathbb{D}^{m\times n}$ satisfying
(2.8.1) forms a $\mathbb{D}$-module with a basis $X_1,\dots,X_\nu$ (Theorem 1.12.3). So any
matrix $X$ which satisfies (2.8.1) is of the form

$X = \sum_{i=1}^{\nu} x_i X_i, \quad x_i \in \mathbb{D}, \quad i = 1,\dots,\nu.$

It is "left" to find whether the function

$\delta(x_1,\dots,x_\nu) := \det\Big(\sum_{i=1}^{\nu} x_i X_i\Big)$

has an invertible value. In such generality this is a difficult problem. A
more modest task is to find the value of $\nu$ and to determine if $\delta(x_1,\dots,x_\nu)$
vanishes identically. For that purpose it is enough to assume that $\mathbb{D}$ is actually
a field $\mathbb{F}$ (for example the quotient field of $\mathbb{D}$). Also we may replace $\mathbb{F}$ by a
finite extension field $\mathbb{K}$ in which the characteristic polynomials of $A$ and $B$
split. Finally we are going to study the equation (2.8.1) where

$A \in \mathbb{K}^{m\times m}, \quad B \in \mathbb{K}^{n\times n}, \quad X \in \mathbb{K}^{m\times n}.$

Let $\psi(x)$, $\phi(x)$ and $J$, $K$ be the minimal polynomials and the Jordan
canonical forms of $A$, $B$ respectively:

$\psi(x) = \prod_{i=1}^{\ell} (x - \lambda_i)^{s_i}, \quad \mathrm{spec}(A) = \{\lambda_1,\dots,\lambda_\ell\},$
$\phi(x) = \prod_{j=1}^{k} (x - \mu_j)^{t_j}, \quad \mathrm{spec}(B) = \{\mu_1,\dots,\mu_k\},$
$P^{-1}AP = J = \bigoplus_{i=1}^{\ell} J_i,$
(2.8.4)
$J_i = \bigoplus_{r=1}^{q_i} (\lambda_i I_{m_{ir}} + H_{m_{ir}}), \quad 1 \le m_{iq_i} \le \dots \le m_{i1} = s_i, \quad i = 1,\dots,\ell,$
$Q^{-1}BQ = K = \bigoplus_{j=1}^{k} K_j,$
$K_j = \bigoplus_{r=1}^{p_j} (\mu_j I_{n_{jr}} + H_{n_{jr}}), \quad 1 \le n_{jp_j} \le \dots \le n_{j1} = t_j, \quad j = 1,\dots,k.$

Let $Y = P^{-1}XQ$. Then the system (2.8.1) is equivalent to $JY - YK = 0$.
Partition $Y$ according to the partitions of $J$ and $K$ as given in (2.8.4). So

$Y = (Y_{ij}), \quad Y_{ij} \in \mathbb{K}^{m_i\times n_j},$
$m_i = \sum_{r=1}^{q_i} m_{ir}, \quad n_j = \sum_{r=1}^{p_j} n_{jr}, \quad i = 1,\dots,\ell, \ j = 1,\dots,k.$

Then the matrix equation for $Y$ reduces to $\ell k$ matrix equations

(2.8.5) $J_i Y_{ij} - Y_{ij} K_j = 0, \quad i = 1,\dots,\ell, \ j = 1,\dots,k.$
The following two lemmas analyze the above matrix equations.

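Lemma 2.8.1 Let $i \in [\ell]$, $j \in [k]$. If $\lambda_i \ne \mu_j$ then the corresponding
matrix equation in (2.8.5) has the unique trivial solution $Y_{ij} = 0$.

Proof. Let

$J_i = \lambda_i I_{m_i} + \bar{J}_i, \quad \bar{J}_i = \bigoplus_{r=1}^{q_i} H_{m_{ir}},$
$K_j = \mu_j I_{n_j} + \bar{K}_j, \quad \bar{K}_j = \bigoplus_{r=1}^{p_j} H_{n_{jr}}.$
Note that $\bar{J}_i^u = \bar{K}_j^v = 0$ for $u \ge m_i$ and $v \ge n_j$. Then (2.8.5) becomes
$(\lambda_i - \mu_j) Y_{ij} = -\bar{J}_i Y_{ij} + Y_{ij} \bar{K}_j.$

Thus

$(\lambda_i - \mu_j)^2 Y_{ij} = -\bar{J}_i (\lambda_i - \mu_j) Y_{ij} + (\lambda_i - \mu_j) Y_{ij} \bar{K}_j$
$= -\bar{J}_i(-\bar{J}_i Y_{ij} + Y_{ij}\bar{K}_j) + (-\bar{J}_i Y_{ij} + Y_{ij}\bar{K}_j)\bar{K}_j$
$= (-\bar{J}_i)^2 Y_{ij} + 2(-\bar{J}_i) Y_{ij} \bar{K}_j + Y_{ij} \bar{K}_j^2.$
Continuing this procedure we get

$(\lambda_i - \mu_j)^r Y_{ij} = \sum_{u=0}^{r} \binom{r}{u} (-\bar{J}_i)^u Y_{ij} \bar{K}_j^{r-u}.$

Hence for $r = m_i + n_j$ either $\bar{J}_i^u$ or $\bar{K}_j^{r-u}$ is a zero matrix. Since $\lambda_i \ne \mu_j$
we deduce that $Y_{ij} = 0$. □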

Lemma 2.8.2 Let $Z = [z_{\alpha\beta}] \in \mathbb{F}^{m\times n}$ satisfy the equation

(2.8.6) $H_m Z = Z H_n.$

Then the entries of $Z$ are of the form

$z_{\alpha\beta} = 0$ for $\beta < \alpha + n - \min(m, n),$
(2.8.7)
$z_{\alpha\beta} = z_{(\alpha+1)(\beta+1)}$ for $\beta \ge \alpha + n - \min(m, n).$

In particular, the subspace of all $m \times n$ matrices $Z$ satisfying (2.8.6) has
dimension $\min(m, n)$.

Proof. Note that the first column and the last row of $H_l$ are equal to
zero. Hence the first column and the last row of $ZH_n = H_mZ$ are equal to
zero. That is

$z_{\alpha 1} = z_{m\beta} = 0, \quad \alpha = 2,\dots,m, \ \beta = 1,\dots,n-1.$

In all other cases, equating the $(\alpha, \beta)$ entries of $H_mZ$ and $ZH_n$ we obtain

$z_{(\alpha+1)\beta} = z_{\alpha(\beta-1)}, \quad \alpha = 1,\dots,m-1, \ \beta = 2,\dots,n.$

The above two sets of equalities yield (2.8.7). □
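The dimension count in Lemma 2.8.2 can be confirmed for small sizes by computing the nullity of $I_n \otimes H_m - H_n^T \otimes I_m$. This is a rough sketch, assuming $H_n$ is the nilpotent Jordan block with ones on the superdiagonal; the helper names (`rank`, `kron`, `nullity_sylvester`) are illustrative.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]


def transpose(a):
    return [list(col) for col in zip(*a)]


def kron(a, b):
    """Kronecker product [a_ij * B]."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def H(n):
    """Nilpotent Jordan block of order n (ones on the superdiagonal)."""
    return [[int(j == i + 1) for j in range(n)] for i in range(n)]


def nullity_sylvester(a, b):
    """dim {X : AX - XB = 0}, computed as the nullity of I_n (x) A - B^T (x) I_m."""
    m, n = len(a), len(b)
    left = kron(identity(n), a)
    right = kron(transpose(b), identity(m))
    mat = [[p - q for p, q in zip(r1, r2)] for r1, r2 in zip(left, right)]
    return m * n - rank(mat)


# Compare with the claimed dimension min(m, n) for all 1 <= m, n <= 4.
results = {(m, n): nullity_sylvester(H(m), H(n))
           for m in range(1, 5) for n in range(1, 5)}
print(all(dim == min(m, n) for (m, n), dim in results.items()))
```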
Combine the above two lemmas to obtain. 

Theorem 2.8.3 Consider the system (2.8.5). If $\lambda_i \ne \mu_j$ then $Y_{ij} =
0$. Assume that $\lambda_i = \mu_j$. Partition $Y_{ij}$ according to the partitions of $J_i$ and
$K_j$ as given in (2.8.4):

$Y_{ij} = [Y_{ij}^{(uv)}], \quad Y_{ij}^{(uv)} \in \mathbb{K}^{m_{iu}\times n_{jv}}, \quad u = 1,\dots,q_i, \ v = 1,\dots,p_j.$

Then each $Y_{ij}^{(uv)}$ is of the form given in Lemma 2.8.2 with $m = m_{iu}$ and
$n = n_{jv}$. Assume that

$\lambda_i = \mu_i, \quad i = 1,\dots,t,$
(2.8.8)
$\lambda_i \ne \mu_j, \quad i = t+1,\dots,\ell, \ j = t+1,\dots,k.$

Then the dimension of the subspace $\mathcal{Y} \subset \mathbb{K}^{m\times n}$ of block matrices $Y =
[Y_{ij}]_{i,j=1}^{\ell,k}$ satisfying (2.8.5) is given by the formula

(2.8.9) $\dim \mathcal{Y} = \sum_{i=1}^{t} \sum_{u,v=1}^{q_i,p_i} \min(m_{iu}, n_{iv}).$

Consider a special case of (2.8.1)

(2.8.10) $C(A) = \{X \in \mathbb{D}^{n\times n} : AX - XA = 0\}.$

Then $C(A)$ is an algebra over $\mathbb{D}$ with the identity $I$. In case $\mathbb{D}$ is a field $\mathbb{F}$,
or more generally $\mathbb{D}$ is a Bezout domain, $C(A)$ has a finite basis. Theorem
2.8.3 yields

$\dim C(A) = \sum_{i=1}^{\ell} \sum_{u,v=1}^{q_i} \min(m_{iu}, m_{iv}).$

(Note that the dimension of $C(A)$ does not change if we pass from $\mathbb{F}$ to a
finite extension field $\mathbb{K}$ in which the characteristic polynomial of $A$ splits.)
As $\{m_{iu}\}_{u=1}^{q_i}$ is a decreasing sequence we have

$\sum_{v=1}^{q_i} \min(m_{iu}, m_{iv}) = u\, m_{iu} + \sum_{v=u+1}^{q_i} m_{iv}.$

So

(2.8.11) $\dim C(A) = \sum_{i=1}^{\ell} \sum_{u=1}^{q_i} (2u - 1) m_{iu}.$
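The step from the double sum of minima to (2.8.11) is purely arithmetic and can be checked for a single eigenvalue. The sketch below assumes the block sizes of one eigenvalue are given as a list; both function names are illustrative.

```python
def dim_commutant_pairs(sizes):
    """Sum of min(m_u, m_v) over all ordered pairs of Jordan block sizes."""
    return sum(min(a, b) for a in sizes for b in sizes)


def dim_commutant_formula(sizes):
    """Sum of (2u - 1) * m_u with the sizes sorted decreasingly, as in (2.8.11)."""
    ordered = sorted(sizes, reverse=True)
    return sum((2 * u - 1) * m for u, m in enumerate(ordered, start=1))


for sizes in ([3], [1, 1, 1], [3, 2, 2, 1], [5, 5, 2], [4, 1]):
    assert dim_commutant_pairs(sizes) == dim_commutant_formula(sizes)

print(dim_commutant_formula([3, 2, 2, 1]))  # 1*3 + 3*2 + 5*2 + 7*1
```

For several eigenvalues the contributions of the individual eigenvalues are added, as in the outer sum over $i$.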

Let $i_1(x),\dots,i_n(x)$ be the invariant polynomials of $xI - A$. Use (2.6.7-2.6.9)
to deduce

(2.8.12) $\dim C(A) = \sum_{u=1}^{n} (2u - 1) \deg i_{n-u+1}(x).$

The above formula enables us to determine when every matrix commuting
with $A$ is a polynomial in $A$. Clearly the dimension of the subspace spanned
by the powers of $A$ is equal to the degree of the minimal polynomial of $A$.

Corollary 2.8.4 Let $A \in \mathbb{F}^{n\times n}$. Then each matrix commuting with
$A$ can be expressed as a polynomial in $A$ if and only if the minimal and
the characteristic polynomials of $A$ are equal. That is, $A$ is similar to a
companion matrix $C(p)$, where $p(x) = \det(xI - A)$.

A matrix for which the minimal and characteristic polynomials coincide is
called nonderogatory. If the minimal polynomial of $A$ is a strict factor of the
characteristic polynomial of $A$, i.e. the degree of the minimal polynomial
is strictly less than the degree of the characteristic polynomial, then $A$ is
called derogatory.

Problems

1. For $A = [a_{ij}] \in \mathbb{D}^{m\times p}$, $B = [b_{kl}] \in \mathbb{D}^{n\times q}$ let
(2.8.13) $A \otimes B := [a_{ij}B] \in \mathbb{D}^{mn\times pq}$
be the tensor (Kronecker) product of $A$ and $B$. Show

$(A_1 \otimes A_2)(B_1 \otimes B_2) = (A_1B_1) \otimes (A_2B_2), \quad A_i \in \mathbb{D}^{m_i\times n_i}, \ B_i \in \mathbb{D}^{n_i\times p_i}, \ i = 1,2.$

2. Let $\mu : \mathbb{D}^{m\times n} \to \mathbb{D}^{mn}$ be given by $\mu(X) = \hat{X}$, where $\hat{X}$ is defined by
(2.8.2). Show that

$\mu(AX) = (I_n \otimes A)\mu(X), \quad \mu(XB) = (B^T \otimes I_m)\mu(X), \quad A \in \mathbb{D}^{m\times m}, \ B \in \mathbb{D}^{n\times n}.$

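3. Let $P \in \mathbb{F}^{m\times m}$, $Q \in \mathbb{F}^{n\times n}$, $R \in \mathbb{F}^{m\times n}$. Let

$A = \begin{bmatrix} P & R \\ 0 & Q \end{bmatrix}, \quad B = \begin{bmatrix} P & 0 \\ 0 & Q \end{bmatrix} \in \mathbb{F}^{(m+n)\times(m+n)}.$

Assume that the characteristic polynomials of $P$ and $Q$ are coprime.
Show that there exists $X = \begin{bmatrix} I_m & Y \\ 0 & I_n \end{bmatrix}$ which satisfies (2.8.1). Hence
$A \approx B$.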

4. Let $A = \bigoplus_{i=1}^{\ell} A_i \in \mathbb{F}^{n\times n}$. Show that

(2.8.14) $\dim C(A) \ge \sum_{i=1}^{\ell} \dim C(A_i),$

and the equality holds if and only if

$(\det(xI - A_i), \det(xI - A_j)) = 1$ for all $i \ne j$, $i, j = 1,\dots,\ell.$

5. Let $A \in \mathbb{D}^{n\times n}$. Show that the ring $C(A)$ is a commutative ring if
and only if $A$ satisfies the conditions of Corollary 2.8.4, where $\mathbb{F}$ is
the quotient field of $\mathbb{D}$.

6. Let $A \in \mathbb{D}^{n\times n}$, $B \in C(A)$. Then $B$ is an invertible element in the
ring $C(A)$ if and only if $B$ is a unimodular matrix.

7. Let $A \in \mathbb{D}^{m\times m}$, $B \in \mathbb{D}^{n\times n}$. Define

(2.8.15) $C(A,B) := \{X \in \mathbb{D}^{m\times n} : AX - XB = 0\}.$

Show that $C(A, B)$ is a left (right) module of $C(A)$ ($C(B)$) under the
matrix multiplication.

8. Let $A, B \in \mathbb{D}^{n\times n}$. Show that $A \approx B$ if and only if the following two
conditions hold:

(a) $C(A, B)$ is a $C(A)$-module with a basis consisting of one element
$U$;

(b) any basis element $U$ is a unimodular matrix.

2.9 A criterion for similarity of two matrices 

Definition 2.9.1 Let $A \in \mathbb{D}^{m\times m}$, $B \in \mathbb{D}^{n\times n}$. Denote by $r(A, B)$ and
$\nu(A, B)$ the rank and the nullity of the matrix $I_n \otimes A - B^T \otimes I_m$ viewed as
a matrix acting on the vector space $\mathbb{F}^{m\times n}$, where $\mathbb{F}$ is the quotient field of
$\mathbb{D}$.

According to Theorem 2.8.3 we have

$\nu(A,B) = \sum_{i=1}^{t} \sum_{u,v=1}^{q_i,p_i} \min(m_{iu}, n_{iv}),$
(2.9.1)
$r(A,B) = mn - \sum_{i=1}^{t} \sum_{u,v=1}^{q_i,p_i} \min(m_{iu}, n_{iv}).$

Theorem 2.9.2 Let $A \in \mathbb{D}^{m\times m}$, $B \in \mathbb{D}^{n\times n}$. Then
$\nu(A,B) \le \frac{1}{2}(\nu(A,A) + \nu(B,B)).$

Equality holds if and only if $m = n$ and $A$ and $B$ are similar over the
quotient field $\mathbb{F}$.

Proof. Without loss of generality we may assume that $\mathbb{D} = \mathbb{F}$ and the
characteristic polynomials of $A$ and $B$ split over $\mathbb{F}[x]$. For $x, y \in \mathbb{R}$ let
$\min(x,y)$ ($\max(x,y)$) be the minimum (maximum) of the values of $x$ and
$y$. Clearly $\min(x, y)$ is a homogeneous concave function on $\mathbb{R}^2$. Hence

(2.9.2) $\min(a + b, c + d) \ge \frac{\min(a, c) + \min(b, d) + \min(a, d) + \min(b, c)}{2}.$

A straightforward calculation shows that if $a = c$ and $b = d$ then equality
holds if and only if $a = b$. Let

$N = \max(m, n), \quad m_{iu} = n_{jv} = 0,$
for $q_i < u \le N$, $p_j < v \le N$, $i = 1,\dots,\ell$, $j = 1,\dots,k$.

Then

$\nu(A, A) + \nu(B, B) = \sum_{i,u=1}^{\ell,N} (2u - 1) m_{iu} + \sum_{j,u=1}^{k,N} (2u - 1) n_{ju} \ge \sum_{i,u=1}^{t,N} (2u - 1)(m_{iu} + n_{iu}),$
and the equality holds if and only if $\ell = k = t$. Next consider the inequality

$\sum_{i,u=1}^{t,N} (2u - 1)(m_{iu} + n_{iu}) = \sum_{i=1}^{t} \sum_{u,v=1}^{N} \min(m_{iu} + n_{iu}, m_{iv} + n_{iv})$
$\ge \frac{1}{2} \sum_{i=1}^{t} \sum_{u,v=1}^{N} \big(\min(m_{iu}, m_{iv}) + \min(m_{iu}, n_{iv}) + \min(n_{iu}, m_{iv}) + \min(n_{iu}, n_{iv})\big)$
$= \frac{1}{2} \sum_{i,u=1}^{t,N} (2u - 1)(m_{iu} + n_{iu}) + \sum_{i=1}^{t} \sum_{u,v=1}^{q_i,p_i} \min(m_{iu}, n_{iv}).$

Combine the above results to obtain the inequality of the theorem. Equality
holds there if and only if $A$ and $B$ have the same Jordan canonical
forms. That is, $m = n$ and $A$ is similar to $B$ over $\mathbb{F}$. □
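The inequality of Theorem 2.9.2 can be tried out directly on Jordan data via formula (2.9.1). In the sketch below a matrix is represented, as an assumption made only for this illustration, by a dictionary mapping each eigenvalue to the list of its Jordan block sizes; the names `nu` and `same_jordan_form` are hypothetical helpers.

```python
def nu(jordan_a, jordan_b):
    """nu(A, B) from (2.9.1): sum of min(m_iu, n_iv) over common eigenvalues."""
    return sum(min(m, n)
               for lam, sizes_a in jordan_a.items()
               for m in sizes_a
               for n in jordan_b.get(lam, []))


def same_jordan_form(jordan_a, jordan_b):
    """True if both matrices have the same blocks, up to their order."""
    normalize = lambda j: {lam: sorted(sizes) for lam, sizes in j.items()}
    return normalize(jordan_a) == normalize(jordan_b)


examples = [
    ({0: [2, 1]}, {0: [3]}),
    ({0: [3]}, {0: [1, 1, 1]}),                # A = H_3, B = 0
    ({1: [2], 2: [1]}, {1: [2], 2: [1]}),      # similar matrices
    ({1: [2], 2: [1]}, {1: [1, 1], 3: [1]}),
]

for ja, jb in examples:
    lhs = 2 * nu(ja, jb)
    rhs = nu(ja, ja) + nu(jb, jb)
    assert lhs <= rhs                                   # the inequality
    assert (lhs == rhs) == same_jordan_form(ja, jb)     # the equality case

print([(2 * nu(a, b), nu(a, a) + nu(b, b)) for a, b in examples])
```

The second example also illustrates a remark made later in this section: for $A = H_3$ and $B = 0$ one has $\nu(A,A) = \nu(A,B) = 3$ although $A$ and $B$ are not similar.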

Suppose that $A \approx B$. Hence (2.2.1) holds. The rules for the tensor
product (Problem 2.8.1) imply

$I \otimes A - B^T \otimes I = ((Q^T)^{-1} \otimes I)(I \otimes A - A^T \otimes I)(Q^T \otimes I),$
(2.9.3)
$I \otimes B - B^T \otimes I = ((Q^T)^{-1} \otimes Q)(I \otimes A - A^T \otimes I)(Q^T \otimes Q^{-1}).$
Hence the three matrices

(2.9.4) $I \otimes A - A^T \otimes I, \quad I \otimes A - B^T \otimes I, \quad I \otimes B - B^T \otimes I$

are similar. In particular, these matrices are equivalent. Over a field $\mathbb{F}$
the above matrices are equivalent if and only if they have the same nullity.
Hence Theorem 2.9.2 yields:

Theorem 2.9.3 Let $A, B \in \mathbb{F}^{n\times n}$. Then $A$ and $B$ are similar if and
only if the three matrices in (2.9.4) are equivalent.

The obvious part of Theorem 2.9.3 extends trivially to any integral domain
$\mathbb{D}$.

Proposition 2.9.4 Let $A, B \in \mathbb{D}^{n\times n}$. If $A$ and $B$ are similar over $\mathbb{D}$
then the three matrices in (2.9.4) are equivalent over $\mathbb{D}$.

However, this condition is not sufficient for the similarity of $A$ and $B$ even
in the case $\mathbb{D} = \mathbb{Z}$. (See Problem 1.) The disadvantage of the similarity
criterion stated in Theorem 2.9.3 is due to the appearance of the matrix
$I \otimes A - B^T \otimes I$, which depends on $A$ and $B$. It is interesting to note that
the equivalence of just two matrices in (2.9.4) does not imply the similarity
of $A$ and $B$. Indeed

$I \otimes A - A^T \otimes I = I \otimes (A + \lambda I) - (A + \lambda I)^T \otimes I$

for any $\lambda \in \mathbb{F}$. If $\mathbb{F}$ has an infinite characteristic then $A \not\approx A + \lambda I$ for
any $\lambda \ne 0$. (Problem 2.) Also if $A = H_n$ and $B = 0$ then $\nu(A,A) =
\nu(A, B) = n$. (Problem 3.) However, under certain assumptions the equality
$\nu(A, A) = \nu(A, B)$ implies $A \approx B$.

Theorem 2.9.5 Let $A \in \mathbb{C}^{n\times n}$. Then there exists a neighborhood of
$A = [a_{ij}]$

(2.9.5) $D(A,\rho) := \{B = [b_{ij}] \in \mathbb{C}^{n\times n} : \sum_{i,j=1}^{n} |b_{ij} - a_{ij}|^2 < \rho^2\},$

for some positive $\rho$ depending on $A$, such that if

(2.9.6) $\nu(A, A) = \nu(A, B), \quad B \in D(A, \rho),$
then $B$ is similar to $A$.

Proof. Let $r$ be the rank of $I \otimes A - A^T \otimes I$. So there exist indices
$\alpha = \{(\alpha_{11},\alpha_{21}),\dots,(\alpha_{1r},\alpha_{2r})\}$, $\beta = \{(\beta_{11},\beta_{21}),\dots,(\beta_{1r},\beta_{2r})\} \subset [n] \times [n]$,
viewed as elements of $[n^2]_r$, such that $\det(I \otimes A - A^T \otimes I)[\alpha, \beta] \ne 0$. Also
$\det(I \otimes A - A^T \otimes I)[\gamma, \delta] = 0$ for any $\gamma, \delta \in [n^2]_{r+1}$. First choose a positive
$\rho'$ such that

(2.9.7) $\det(I \otimes A - B^T \otimes I)[\alpha,\beta] \ne 0$ for all $B \in D(A,\rho')$.

Consider the system (2.8.1) as a system in $n^2$ variables, which are the
entries of $X = [x_{ij}]_1^n$. In the system (2.8.1) consider the subsystem of $r$
equations corresponding to the set $\alpha$:

(2.9.8) $\sum_{k=1}^{n} (a_{ik}x_{kj} - x_{ik}b_{kj}) = 0, \quad i = \alpha_{1p}, \ j = \alpha_{2p}, \ p = 1,\dots,r.$

Let

(2.9.9) $x_{kj} = \delta_{kj}$ for $(k, j) \ne (\beta_{1p}, \beta_{2p})$, $p = 1,\dots,r$.

The condition (2.9.7) yields that the system (2.9.8)-(2.9.9) has a unique
solution $X(B)$ for any $B \in D(A,\rho')$. Also $X(A) = I$. Use the continuity
argument to deduce the existence of $\rho \in (0, \rho']$ so that $\det X(B) \ne 0$ for
all $B \in D(A,\rho)$. Let $V$ be the algebraic variety of all matrices $B \in \mathbb{C}^{n\times n}$
satisfying

(2.9.10) $\det(I \otimes A - B^T \otimes I)[\gamma,\delta] = 0$ for any $\gamma, \delta \in Q_{(r+1),n^2}$.

We claim that $V \cap D(A, \rho)$ is the set of matrices of the form (2.9.6). Indeed,
let $B \in V \cap D(A,\rho)$. Then (2.9.7) and (2.9.10) yield that $\nu(A, B) =
\nu(A,A) = n^2 - r$. Assume that $B$ satisfies (2.9.6). Hence (2.9.10) holds
and $B \in V \cap D(A, \rho)$. Assume that $B \in V \cap D(A, \rho)$. Then

$AX(B) - X(B)B = 0, \quad \det X(B) \ne 0 \ \Rightarrow \ A \approx B.$

□

Problems 

1. Show that for A and B given in Problem 2.2.1 the three matrices in 
(2.9.4) are equivalent over Z, but A and B are not similar over Z. 
(See Problem 2.2.1.) 

2. Show that if $\mathbb{F}$ has an infinite characteristic then for any $A \in \mathbb{F}^{n\times n}$,
$A \approx A + \lambda I$ if and only if $\lambda = 0$. (Compare the traces of $A$ and
$A + \lambda I$.)

3. Show that if $A = H_n$ and $B = 0$ then $\nu(A, A) = \nu(A, B) = n$.

4. Let $A, B \in \mathbb{D}^{n\times n}$. Assume that the three matrices in (2.9.4) are
equivalent. Let $\mathcal{I}$ be a maximal ideal in $\mathbb{D}$. Let $\mathbb{F} = \mathbb{D}/\mathcal{I}$ and view
$A$, $B$ as matrices over $\mathbb{F}$. Prove that $A$ and $B$ are similar over $\mathbb{F}$. (Show
that the matrices in (2.9.4) are equivalent over $\mathbb{F}$.)

2.10 The matrix equation AX - XB = C 

A related equation to (2.8.1) is the nonhomogeneous equation
(2.10.1) $AX - XB = C, \quad A \in \mathbb{F}^{m\times m}, \ B \in \mathbb{F}^{n\times n}, \ C, X \in \mathbb{F}^{m\times n}.$
This equation can be written in the tensor notation as

(2.10.2) $(I_n \otimes A - B^T \otimes I_m)\hat{X} = \hat{C}.$

The necessary and sufficient condition for the solvability of (2.10.2) can
be stated in the dual form as follows. Consider the homogeneous system
whose coefficient matrix is the transposed coefficient matrix of (2.10.2)
(see Problem 1),

$(I_n \otimes A^T - B \otimes I_m)\hat{Y} = 0.$

Then (2.10.2) is solvable if and only if any solution $\hat{Y}$ of the above system
is orthogonal to $\hat{C}$ (e.g. Problem 2). In matrix form the above equation is
equivalent to

$A^T Y - Y B^T = 0, \quad Y \in \mathbb{F}^{m\times n}.$

The orthogonality of $\hat{Y}$ and $\hat{C}$ is written as $\mathrm{tr}\, Y^T C = 0$. (See Problem
3.) Thus we showed:

Theorem 2.10.1 Let $A \in \mathbb{F}^{m\times m}$, $B \in \mathbb{F}^{n\times n}$. Then (2.10.1) is solvable
if and only if

(2.10.3) $\mathrm{tr}\, ZC = 0$
for all $Z \in \mathbb{F}^{n\times m}$ satisfying

(2.10.4) $ZA - BZ = 0.$

Using the above Theorem we can obtain a stronger version of Problem 4.
Theorem 2.10.2 Let

$G = [G_{ij}]_1^{\ell}, \quad G_{ij} \in \mathbb{F}^{n_i\times n_j}, \quad G_{ij} = 0$ for $j < i$, $i,j = 1,\dots,\ell$.

Then

(2.10.5) $\dim C(G) \ge \sum_{i=1}^{\ell} \dim C(G_{ii}).$

Proof. Consider first the case $\ell = 2$. Let $G = \begin{bmatrix} A & E \\ 0 & B \end{bmatrix}$. Assume that
$\begin{bmatrix} U & X \\ 0 & V \end{bmatrix}$ commutes with $G$. So

(2.10.6) $AU - UA = 0, \quad BV - VB = 0, \quad AX - XB = UE - EV.$

Theorem 2.10.1 implies that $U \in C(A)$, $V \in C(B)$ satisfy the last equation
of (2.10.6) if and only if $\mathrm{tr}\, Z(UE - EV) = 0$ for all $Z$ satisfying (2.10.4).
Thus the dimension of pairs $(U, V)$ satisfying (2.10.6) is at least

$\dim C(A) + \dim C(B) - \dim C(B, A).$

On the other hand, for a fixed $(U,V)$ satisfying (2.10.6), the set of all $X$
satisfying the last equation of (2.10.6) is of the form $X + C(A,B)$. The
equality (2.8.9) yields $\dim C(A, B) = \dim C(B, A)$. Hence (2.10.5) holds
for $\ell = 2$. The general case follows straightforwardly by induction on $\ell$. □

We remark that contrary to the results given in Problem 2.8.4 the equality
in (2.10.5) may occur even if $G_{ii} = G_{jj}$ for some $i \ne j$. (See Problem
4.)

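Theorem 2.10.3 Let $A \in \mathbb{F}^{m\times m}$, $B \in \mathbb{F}^{n\times n}$, $C \in \mathbb{F}^{m\times n}$. Let

$F = \begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix}, \quad G = \begin{bmatrix} A & C \\ 0 & B \end{bmatrix} \in \mathbb{F}^{(m+n)\times(m+n)}.$

Then $F \approx G$ if and only if the matrix equation (2.10.1) is solvable.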

Proof. Assume that (2.10.1) is solvable. Then $U = \begin{bmatrix} I_m & X \\ 0 & I_n \end{bmatrix} \in GL(m+n,\mathbb{F})$
and $G = U^{-1}FU$.

Assume now that $F \approx G$. We prove the solvability of (2.10.1) by
induction on $m + n$, where $m, n \ge 1$. Let $\mathbb{K}$ be a finite extension of $\mathbb{F}$ such
that the characteristic polynomials of $A$ and $B$ split to linear factors. Clearly
it is enough to prove the solvability of (2.10.1) for the field $\mathbb{K}$. Suppose first
that $A$ and $B$ do not have common eigenvalues. Then Problem 2.8.3 yields
that $F \approx G$ via a matrix $U$ of the above form, so (2.10.1) is solvable.
Assume now that $A$ and $B$ have a common eigenvalue $\lambda_1$.
For $m = n = 1$ it means that $A = B = \lambda_1 \in \mathbb{F}$. Then the assumption that
$F \approx G$ implies that $C = 0$ and (2.10.1) is solvable with $X = 0$.

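Assume now that the theorem holds for all $2 \le m + n < L$. Let $m +
n = L$. The above arguments yield that it is enough to consider the case
where the characteristic polynomials of $A$ and $B$ split to linear factors
and $\lambda_1$ is a common eigenvalue of $A$ and $B$. By considering the matrices
$F - \lambda_1 I_{m+n}$, $G - \lambda_1 I_{m+n}$ we may assume that $0$ is an eigenvalue of $A$
and $B$. By considering the similar matrices $U^{-1}FU$, $U^{-1}GU$ where $U =
U_1 \oplus U_2$, $U_1 \in GL(m,\mathbb{F})$, $U_2 \in GL(n,\mathbb{F})$ we may assume that $A$ and $B$
are in a Jordan canonical form of the form

$A = A_1 \oplus A_2, \quad B = B_1 \oplus B_2, \quad A_1^m = 0, \quad B_1^n = 0, \quad 0 \notin (\mathrm{spec}(A_2) \cup \mathrm{spec}(B_2)).$
(It is possible that either $A = A_1$ or $B = B_1$.) Let

$U = \begin{bmatrix} I_m & X \\ 0 & I_n \end{bmatrix}, \quad X = \begin{bmatrix} 0 & X_{12} \\ X_{21} & 0 \end{bmatrix}, \quad C = \begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix}.$

Use Problem 2.8.3 to deduce that one can choose $X_{12}, X_{21}$ such that
$G' = U^{-1}GU = \begin{bmatrix} A & C' \\ 0 & B \end{bmatrix}$ and $C'_{12} = 0$, $C'_{21} = 0$. For simplicity of notation
we will assume that $C_{12} = 0$, $C_{21} = 0$. Permute second and third blocks in
$F$, $G$ to obtain that $F$, $G$ are permutationally similar to

$\hat{F} = \begin{bmatrix} A_1 & 0 & 0 & 0 \\ 0 & B_1 & 0 & 0 \\ 0 & 0 & A_2 & 0 \\ 0 & 0 & 0 & B_2 \end{bmatrix}, \quad \hat{G} = \begin{bmatrix} A_1 & C_{11} & 0 & 0 \\ 0 & B_1 & 0 & 0 \\ 0 & 0 & A_2 & C_{22} \\ 0 & 0 & 0 & B_2 \end{bmatrix}$

respectively. So the Jordan canonical forms of $F$, $G$ corresponding to $0$ are
determined by the Jordan canonical forms of

$\begin{bmatrix} A_1 & 0 \\ 0 & B_1 \end{bmatrix}, \quad \begin{bmatrix} A_1 & C_{11} \\ 0 & B_1 \end{bmatrix}$

respectively. The Jordan canonical forms of $F$, $G$ corresponding to other
eigenvalues are determined by the Jordan canonical forms of

$\begin{bmatrix} A_2 & 0 \\ 0 & B_2 \end{bmatrix}, \quad \begin{bmatrix} A_2 & C_{22} \\ 0 & B_2 \end{bmatrix}$

respectively. Hence

$\begin{bmatrix} A_1 & 0 \\ 0 & B_1 \end{bmatrix} \approx \begin{bmatrix} A_1 & C_{11} \\ 0 & B_1 \end{bmatrix}, \quad \begin{bmatrix} A_2 & 0 \\ 0 & B_2 \end{bmatrix} \approx \begin{bmatrix} A_2 & C_{22} \\ 0 & B_2 \end{bmatrix}.$

Thus if either $A$ or $B$ is not nilpotent the theorem follows by induction.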

It is left to consider the case where $A$ and $B$ are nilpotent matrices,
which are in their Jordan canonical form. If $A = 0$, $B = 0$ then $C = 0$ and
the theorem follows. So we assume that at least one of the matrices
in $\{A,B\}$ is not a zero matrix. Since $\dim \ker F = \dim \ker G$, Problem
6 yields that (after the upper triangular similarity applied to $G$) we may
assume that $\ker F = \ker G$. Let

$A = \bigoplus_{i=1}^{p} A_i, \quad B = \bigoplus_{j=1}^{q} B_j,$

where each $A_i$, $B_j$ is an upper triangular Jordan block of dimension $m_i$, $n_j$
respectively. Let

$C = [C_{ij}], \quad C_{ij} \in \mathbb{F}^{m_i\times n_j}, \quad i = 1,\dots,p, \ j = 1,\dots,q,$

be the block partition of $C$ induced by the block partition of $A$ and $B$
respectively. The assumption that $\ker F = \ker G$ is equivalent to the assumption
that the first column of each $C_{ij}$ is zero. Consider $V = \mathbb{F}^{m+n}/\ker F$.
Then $F$, $G$ induce the operators $\tilde{F}$, $\tilde{G}$ on $V$ which are obtained from $F$, $G$
by deleting the rows and columns corresponding to the vectors in the kernels
of $A$ and $B$. (These vectors are formed by some of the vectors in the
canonical basis of $\mathbb{F}^{m+n}$.) Note that the Jordan canonical forms of $\tilde{F}$, $\tilde{G}$ are
direct sums of reduced Jordan blocks (obtained by deleting the first row
and column in each Jordan block) corresponding to $F$, $G$ respectively. As
$F$ and $G$ have the same Jordan blocks it follows that $\tilde{F}$, $\tilde{G}$ have the same
Jordan blocks, i.e. $\tilde{F} \approx \tilde{G}$. It is easy to see that

$\tilde{F} = \begin{bmatrix} \tilde{A} & 0 \\ 0 & \tilde{B} \end{bmatrix}, \quad \tilde{G} = \begin{bmatrix} \tilde{A} & \tilde{C} \\ 0 & \tilde{B} \end{bmatrix},$
$\tilde{A} = \bigoplus_{i=1}^{p} \tilde{A}_i, \quad \tilde{B} = \bigoplus_{j=1}^{q} \tilde{B}_j, \quad \tilde{C} = (\tilde{C}_{ij}),$
$\tilde{A}_i \in \mathbb{F}^{(m_i-1)\times(m_i-1)}, \quad \tilde{B}_j \in \mathbb{F}^{(n_j-1)\times(n_j-1)}, \quad \tilde{C}_{ij} \in \mathbb{F}^{(m_i-1)\times(n_j-1)}.$

Here $\tilde{A}_i$, $\tilde{B}_j$, $\tilde{C}_{ij}$ are obtained from $A_i$, $B_j$, $C_{ij}$ by deleting the first row and
column respectively. Since $\tilde{F} \approx \tilde{G}$ we can use the induction hypothesis.
That is, there exists $X = (X_{ij}) \in \mathbb{F}^{m\times n}$ partitioned as $C$ with the following
properties: The first row and the first column of each $X_{ij}$ is zero. $A_iX_{ij} -
X_{ij}B_j - C_{ij}$ have zero entries in the last $m_i - 1$ rows and in the first column.
By considering $U^{-1}GU$ with $U = \begin{bmatrix} I_m & X \\ 0 & I_n \end{bmatrix}$ we already may assume that
the last $m_i - 1$ rows and the first column of each $C_{ij}$ are zero. Finally we
observe that if $A_i$ and $B_j$ are Jordan blocks then the equation (2.10.1) is
solvable by letting $X_{ij}$ be a corresponding matrix with the last $m_i - 1$ rows
equal to zero. □
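The easy direction of Theorem 2.10.3 is a direct block computation and can be replayed on a small example. The sketch below picks $A$, $B$ and $X$ arbitrarily (these values are assumptions made only for the illustration), sets $C = AX - XB$, and checks that $U^{-1}FU = G$ for $U = \begin{bmatrix} I_m & X \\ 0 & I_n \end{bmatrix}$; the helper `block_upper` is a hypothetical name.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]


def block_upper(a, c, b):
    """Assemble [[a, c], [0, b]] from blocks a (m x m), c (m x n), b (n x n)."""
    m, n = len(a), len(b)
    top = [a[i] + c[i] for i in range(m)]
    bottom = [[0] * m + b[i] for i in range(n)]
    return top + bottom


A = [[1, 1], [0, 1]]
B = [[2]]
X = [[1], [2]]

# Choose C so that (2.10.1) is solvable by construction.
AX, XB = matmul(A, X), matmul(X, B)
C = [[p - q for p, q in zip(r1, r2)] for r1, r2 in zip(AX, XB)]

m, n = len(A), len(B)
zero = [[0] * n for _ in range(m)]
F = block_upper(A, zero, B)
G = block_upper(A, C, B)
U = block_upper(identity(m), X, identity(n))
U_inv = block_upper(identity(m), [[-x for x in row] for row in X], identity(n))

assert matmul(U, U_inv) == identity(m + n)   # U_inv really inverts U
assert matmul(matmul(U_inv, F), U) == G      # G = U^{-1} F U
print(C, G)
```

The converse direction, which is the substance of the theorem, is not exercised by this check.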



Problems 

1. Let $A \otimes B$ be defined as in (2.8.13). Prove that $(A \otimes B)^T = A^T \otimes B^T$.

2. Consider the system

$Ax = b, \quad A \in \mathbb{F}^{m\times n}, \ b \in \mathbb{F}^m.$

Show the above system is solvable if and only if any solution of $A^Ty = 0$
satisfies $y^Tb = 0$. (Change variables to bring $A$ to its diagonal form
as in §1.12.)

3. Let $X, Y \in \mathbb{D}^{m\times n}$. Let $\mu(X), \mu(Y) \in \mathbb{D}^{mn}$ be defined as in Problem
2.8.2. Show that

$\mu(X)^T \mu(Y) = \mathrm{tr}\, Y^T X.$

4. Assume in Theorem 2.10.2 $\ell = 2$, $G_{11} = G_{22} = 0$, $G_{12} = I$. Show
that in this case the equality sign holds in (2.10.5).

5. Let $A_i \in \mathbb{F}^{n_i\times n_i}$, $i = 1,2$ and suppose that $A_1$ and $A_2$ do not have a
common eigenvalue. Assume that $A = A_1 \oplus A_2$. Let

$C = [C_{ij}]_1^2, \quad X = [X_{ij}]_1^2, \quad C_{ij}, X_{ij} \in \mathbb{F}^{n_i\times n_j}, \quad i, j = 1,2.$

Using Problem 2.8.4 prove that the equation $AX - XA = C$ is solvable
if and only if the equations $A_iX_{ii} - X_{ii}A_i = C_{ii}$, $i = 1, 2$ are solvable.

6. Let $A \in \mathbb{F}^{m\times m}$, $B \in \mathbb{F}^{n\times n}$ be two nilpotent matrices. Let $C \in \mathbb{F}^{m\times n}$
and define the matrices $F, G \in \mathbb{F}^{(m+n)\times(m+n)}$ as in Theorem 2.10.3.
Show that $\dim \ker F \ge \dim \ker G$. Equality holds if and only if
$C \ker B \subset \mathrm{Range}\, A$. Equivalently, equality holds if and only if there
exists $X \in \mathbb{F}^{m\times n}$ such that

$\ker F = \ker U^{-1}GU, \quad U = \begin{bmatrix} I_m & X \\ 0 & I_n \end{bmatrix}.$

2.11 A case of two nilpotent matrices 

Theorem 2.11.1 Let $T \in \mathrm{Hom}(V)$ be nilpotent. Let index $0 = m =
m_1 \ge m_2 \ge \dots \ge m_p \ge 1$ be the dimensions of all Jordan blocks appearing
in the Jordan canonical form of $T$. Let $Z \subset V$ be an eigenspace of $T$
corresponding to the eigenvalue $0$. Denote $W = V/Z$. Then $T$ induces
a nilpotent operator $T' \in \mathrm{Hom}(W)$. The dimensions of the Jordan blocks of $T'$
correspond to the positive integers in the sequence $m'_1, m'_2, \dots, m'_p$, where
$m'_i$ is either $m_i$ or $m_i - 1$. Furthermore, exactly $\dim Z$ of the indices $m'_i$ are
equal to $m_i - 1$.

Proof. Suppose first that $p = 1$, i.e. $V$ is an irreducible invariant
subspace of $T$. Then $Z$ is the eigenspace of $T$ and the theorem is
straightforward. Use Corollary 2.7.8 in the general case to deduce the theorem. □

Theorem 2.11.2 Let $A \in \mathbb{F}^{n\times n}$ be a nilpotent matrix. Put
$X_k = \{x \in \mathbb{F}^n : A^k x = 0\}, \quad k = 0,\dots$

Then

(2.11.1) $X_0 = \{0\}$ and $X_i \subset X_{i+1}$, $i = 0,\dots$

Assume that

$X_i \ne X_{i+1}$ for $i = 0,\dots,p-1$, and $X_p = \mathbb{F}^n$, $p = $ index $0$.

Suppose that $B \in \mathbb{F}^{n\times n}$ satisfies

(2.11.2) $BX_{i+1} \subset X_i$, for $i = 0,\dots,p-1$.
Then $B$ is nilpotent and

(2.11.3) $\nu(A, A) \le \nu(B,B).$
Equality holds if and only if $B$ is similar to $A$.

Proof. Clearly (2.11.2) holds with $B = A$ for any $A \in \mathbb{F}^{n\times n}$. As $B^p\mathbb{F}^n = B^pX_p \subset
X_0 = \{0\}$, it follows that $B$ is nilpotent. We prove the claim by induction
on $p$. For $p = 1$ we have $A = B = 0$ and equality holds in (2.11.3). Suppose that
the theorem holds for $p = q - 1$. Let $p = q$.

Assume that the Jordan blocks of $A$ and $B$ are of the sizes $q = m_1 \ge \dots \ge
m_j \ge 1$ and $l_1 \ge \dots \ge l_k \ge 1$ respectively. Recall that $X_1$ is the eigenspace
of $A$ corresponding to $\lambda = 0$. Hence $j = \dim X_1$. Since $BX_1 = X_0 = \{0\}$
it follows that the dimension of the eigenspace of $B$ is at least $j$. Hence
$k \ge j$.

Let $W := V/X_1$. Since $AX_1 = \{0\}$, $A$ induces a nilpotent operator
$A' \in \mathrm{Hom}(W)$. Let $X'_i = \ker(A')^i$, $i = 1,\dots$. Then $X'_i = X_{i+1}/X_1$, $i =
0, 1,\dots$. Hence the index of $A'$ equals $q - 1$. Furthermore the Jordan blocks
of $A'$ correspond to the positive numbers in the sequence $m'_1 = m_1 - 1 \ge
\dots \ge m'_j = m_j - 1$. Since $BX_1 = \{0\}$ it follows that $B$ induces the operator
$B' \in \mathrm{Hom}(W)$. The equality $X'_i = X_{i+1}/X_1$ implies that $B'X'_i \subset X'_{i-1}$
for $i = 1,\dots$.

Theorem 2.11.1 implies that the Jordan blocks of $B'$ correspond to
nonzero $l'_1,\dots,l'_k$, where $l'_i$ is either $l_i$ or $l_i - 1$. Furthermore exactly $j$
of $l'_i$ are equal to $l_i - 1$. Recall (2.8.11) that

$\nu(A, A) = \sum_{i=1}^{j} (2i - 1)m_i = \sum_{i=1}^{j} (2i - 1)m'_i + \sum_{i=1}^{j} (2i - 1) = \nu(A', A') + j^2.$

Assume that

$l_1 = \dots = l_{k_1} > l_{k_1+1} = \dots = l_{k_2} > l_{k_2+1} = \dots > l_{k_{r-1}+1} = \dots = l_{k_r},$

where $k_r = k$. Let $k_0 = 0$. Suppose that in the set $\{k_{s-1} + 1,\dots,k_s\}$ we
have exactly $i \le k_s - k_{s-1}$ indices such that $l'_r = l_r - 1$ for $r \in \{k_{s-1} +
1,\dots,k_s\}$. We then assume that $l'_r = l_r - 1$ for $r = k_s, k_s - 1,\dots,k_s - i + 1$.
Hence $l'_1 \ge \dots \ge l'_k \ge 0$. Thus $\nu(B', B') = \sum_{i=1}^{k} (2i - 1)l'_i$. So

$\nu(B, B) = \sum_{i=1}^{k} (2i - 1)l_i = \nu(B', B') + \sum_{i=1}^{k} (2i - 1)(l_i - l'_i) \ge \nu(B', B') + j^2.$

Equality holds if and only if $l'_i = l_i - 1$ for $i = 1,\dots,j$. The induction
hypothesis implies that $\nu(A', A') \le \nu(B', B')$ and equality holds if
and only if $A' \approx B'$, i.e. $A'$ and $B'$ have the same Jordan blocks. Hence
$\nu(A, A) \le \nu(B, B)$ and equality holds if and only if $A \approx B$. □



2.12 Historical remarks 

The exposition of §2.1 is close to [Gan59]. The content of §2.2 is standard. 
Theorem 2.3.4 is well known [Gan59]. Other results of §2.3 are not common 
and some of them may be new. §2.4 is standard and its exposition is close 
to [Gan59]. Theorem 2.5.4 is probably known for $\mathbb{D}_{ED}$ (see [Lea48] for the
case $\mathbb{D} = \mathrm{H}(\Omega)$, $\Omega \subset \mathbb{C}$). Perhaps it is new for Bezout domains. The
results of §2.6 are standard. Most of §2.7 is standard. The exposition of 
§2.8 is close to [Gan59]. For additional properties of tensor product see 
[MaM64] . Problem 2.8.8 is close to the results of [Fad66]. See also [Gur80] 
for an arbitrary integral domain D. Theorems 2.9.2 and 2.9.3 are taken 
from [Fri80b]. See [GaB77] for a weaker version of Theorem 2.9.3. Some of 
the results of §2.10 may be new. Theorem 2.10.1 was taken from [Fri80a]. 
Theorem 2.10.3 is called Roth's theorem [Rot52]. Theorem 2.11.2 is taken
from [Fri80b]. 



Chapter 3 

Functions of Matrices and 
Analytic Similarity 

3.1 Components of a matrix and functions of 
matrices 

In this chapter we assume that all matrices are complex valued ($\mathbb{F} = \mathbb{C}$) unless otherwise stated. Let $\phi(x)$ be a polynomial ($\phi \in \mathbb{C}[x]$). The following relations are easily established:

(3.1.1) $\phi(B) = P\phi(A)P^{-1}$, $B = PAP^{-1}$, $A, B \in \mathbb{C}^{n \times n}$, $P \in \mathrm{GL}_n(\mathbb{C})$,
        $\phi(A_1 \oplus A_2) = \phi(A_1) \oplus \phi(A_2)$.

It often pays to know the explicit formula for $\phi(A)$ in terms of the Jordan canonical form of $A$. In view of (3.1.1) it is enough to consider the case where $J$ is composed of one Jordan block.

Lemma 3.1.1 Let $J = \lambda_0 I + H \in \mathbb{C}^{n \times n}$, where $H = H_n$. Then for any $\phi \in \mathbb{C}[x]$

$\phi(J) = \sum_{k=0}^{n-1} \frac{\phi^{(k)}(\lambda_0)}{k!} H^k.$

Proof. For any $\phi$ we have the Taylor expansion

$\phi(x) = \sum_{k=0}^{N} \frac{\phi^{(k)}(\lambda_0)}{k!} (x - \lambda_0)^k$, $N = \max(\deg \phi, n)$.

As $H^l = 0$ for $l \ge n$, from the above equality we deduce the lemma. □
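The formula of Lemma 3.1.1 can be checked numerically on a small example. The following is a minimal sketch in plain Python, not part of the original text: the helper names (`jordan_block`, `poly_derivs`, `phi_of_jordan_block`, `phi_direct`) and the sample polynomial are illustrative assumptions. It builds $\phi(J)$ once from the derivatives $\phi^{(k)}(\lambda_0)/k!$ placed on the $k$-th superdiagonal, and once by direct (Horner) evaluation of the polynomial at $J$.

```python
from math import factorial

def jordan_block(lam, n):
    """J = lam*I + H_n, with ones on the first superdiagonal."""
    return [[lam if i == j else (1 if j == i + 1 else 0) for j in range(n)]
            for i in range(n)]

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_derivs(coeffs, x, count):
    """Return [phi(x), phi'(x), ..., phi^(count-1)(x)] for phi = sum coeffs[d] * x**d."""
    values, c = [], list(coeffs)
    for _ in range(count):
        values.append(sum(a * x**d for d, a in enumerate(c)))
        c = [d * a for d, a in enumerate(c)][1:]  # coefficients of the derivative
    return values

def phi_of_jordan_block(coeffs, lam, n):
    """phi(J) via Lemma 3.1.1: entry (i, j) equals phi^(j-i)(lam) / (j-i)! for j >= i."""
    d = poly_derivs(coeffs, lam, n)
    return [[d[j - i] / factorial(j - i) if j >= i else 0 for j in range(n)]
            for i in range(n)]

def phi_direct(coeffs, J):
    """Horner evaluation of the polynomial at the matrix J."""
    n = len(J)
    result = [[0] * n for _ in range(n)]
    for a in reversed(coeffs):
        result = mat_mul(result, J)
        for i in range(n):
            result[i][i] += a
    return result

coeffs = [1, -1, 0, 1]                     # phi(x) = x^3 - x + 1 (sample choice)
J = jordan_block(2, 3)
via_taylor = phi_of_jordan_block(coeffs, 2, 3)
via_horner = phi_direct(coeffs, J)
print(via_taylor == via_horner)
```

Both routes agree because every power $H^k$ with $k \ge n$ vanishes, which is exactly the step used in the proof.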
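Using the Jordan canonical form of $A$ we obtain:

Theorem 3.1.2 Let $A \in \mathbb{C}^{n \times n}$. Assume that the Jordan canonical form of $A$ is given by (2.6.5). Then for $\phi \in \mathbb{C}[x]$ we have

(3.1.2) $\phi(A) = P \Big( \bigoplus_{i=1}^{\ell} \bigoplus_{j=1}^{q_i} \sum_{k=0}^{m_{ij}-1} \frac{\phi^{(k)}(\lambda_i)}{k!} H_{m_{ij}}^{k} \Big) P^{-1}.$

Definition 3.1.3 Let the assumptions of Theorem 3.1.2 hold. Then $Z_{ik} = Z_{ik}(A)$ is called the $(i, k)$ component of $A$ and is given by

(3.1.3) $Z_{ik} = P \Big( 0 \oplus \dots \oplus 0 \oplus \bigoplus_{j=1}^{q_i} H_{m_{ij}}^{k} \oplus 0 \oplus \dots \oplus 0 \Big) P^{-1},$
        $k = 0, \dots, s_i - 1$, $s_i = m_{i1}$, $i = 1, \dots, \ell$.

Compare (3.1.2) with (3.1.3) to deduce

(3.1.4) $\phi(A) = \sum_{i=1}^{\ell} \sum_{j=0}^{s_i - 1} \frac{\phi^{(j)}(\lambda_i)}{j!} Z_{ij}.$

Definition 3.1.4 Let $A \in \mathbb{C}^{n \times n}$ and assume that $\Omega \subset \mathbb{C}$ contains spec $(A)$. Then for $\phi \in \mathrm{H}(\Omega)$ define $\phi(A)$ by (3.1.4).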

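Using (3.1.3) it is easy to verify that the components of $A$ satisfy

(3.1.5)
$Z_{ij}$, $i = 1, \dots, \ell$, $j = 0, \dots, s_i - 1$, are linearly independent,
$Z_{ij} Z_{pq} = 0$ if either $i \ne p$, or $i = p$ and $j + q \ge s_i$,
$Z_{ij} Z_{iq} = Z_{i(j+q)}$ for $j + q \le s_i - 1$,
$A = \sum_{i=1}^{\ell} (\lambda_i Z_{i0} + Z_{i1})$.

(The last identity is (3.1.4) applied to $\phi(x) = x$; the matrix $P$ is already contained in the components.) Consider the component $Z_{i(s_i-1)}$. The above relations imply

(3.1.6) $A Z_{i(s_i-1)} = Z_{i(s_i-1)} A = \lambda_i Z_{i(s_i-1)}.$

Thus the nonzero columns of $Z_{i(s_i-1)}$, $Z_{i(s_i-1)}^{\top}$ are eigenvectors of $A$, $A^{\top}$ respectively corresponding to $\lambda_i$. (Note that $Z_{i(s_i-1)} \ne 0$.)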
Lemma 3.1.5 Let $A \in \mathbb{C}^{n \times n}$. Assume that $\lambda_i$ is an eigenvalue of $A$. Let $X_i$ be the generalized eigenspace of $A$ corresponding to $\lambda_i$:

(3.1.7) $X_i = \{x \in \mathbb{C}^n : (\lambda_i I - A)^{s_i} x = 0\}.$

Then

(3.1.8) rank $Z_{i(s_i-1)} = \dim (\lambda_i I - A)^{s_i - 1} X_i.$

Proof. It is enough to assume that $A$ is in its Jordan form. Then $X_i$ is the subspace of all $x = (x_1, \dots, x_n)^{\top}$ where the first $\sum_{p=1}^{i-1} \sum_{j=1}^{q_p} m_{pj}$ coordinates and the last $\sum_{p=i+1}^{\ell} \sum_{j=1}^{q_p} m_{pj}$ coordinates vanish. So $(\lambda_i I - A)^{s_i - 1} X_i$ contains only those eigenvectors which correspond to Jordan blocks of length $s_i$. Clearly, the rank of $Z_{i(s_i-1)}$ is exactly the number of such blocks. □

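Definition 3.1.6 Let $A \in \mathbb{C}^{n \times n}$. Then the spectral radius $\rho(A)$, the peripheral spectrum spec$_{\mathrm{peri}}(A)$ and the index of $A$ are given by

(3.1.9)
$\rho(A) = \max_{\lambda \in \mathrm{spec}(A)} |\lambda|$,
spec$_{\mathrm{peri}}(A) = \{\lambda \in \mathrm{spec}(A) : |\lambda| = \rho(A)\}$,
index $A = \max_{\lambda \in \mathrm{spec}_{\mathrm{peri}}(A)}$ index $\lambda$.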



Problems 

1. Let $A \in \mathbb{C}^{n \times n}$ and let $\psi \in \mathbb{C}[x]$ be the minimal polynomial of $A$. Assume that $\Omega \subset \mathbb{C}$ is an open set such that spec $(A) \subset \Omega$. Let $\phi \in \mathrm{H}(\Omega)$. Then the values

(3.1.10) $\phi^{(k)}(\lambda)$, $k = 0, \dots,$ index $\lambda - 1$, $\lambda \in$ spec $(A)$

are called the values of $\phi$ on the spectrum of $A$. Two functions $\phi, \theta \in \mathrm{H}(\Omega)$ are said to coincide on spec $(A)$ if they have the same values on spec $(A)$. Assume that $\phi \in \mathbb{C}[x]$ and let

$\phi = \omega\psi + \theta$, $\deg \theta < \deg \psi$.

Show that $\theta$ coincides with $\phi$ on spec $(A)$. Let

$\frac{\phi(x)}{\psi(x)} = \omega(x) + \frac{\theta(x)}{\psi(x)} = \omega(x) + \sum_{i=1}^{\ell} \sum_{j=1}^{s_i} \frac{\alpha_{ij}}{(x - \lambda_i)^j},$

where $\psi$ is given by (2.5.11). Show that $\alpha_{ij}$, $j = s_i, \dots, s_i - p$ are determined recursively by $\phi^{(j)}(\lambda_i)$, $j = 0, \dots, p$. (Multiply the above equality by $\psi(x)$ and evaluate this identity at $\lambda_i$.) For any $\phi \in \mathrm{H}(\Omega)$ define $\theta$ by the equality

(3.1.11) $\theta(x) = \psi(x) \sum_{i=1}^{\ell} \sum_{j=1}^{s_i} \frac{\alpha_{ij}}{(x - \lambda_i)^j}.$

The polynomial $\theta$ is called the Lagrange-Sylvester (L-S) interpolation polynomial of $\phi$ (corresponding to $\psi$). Prove that

(3.1.12) $\phi(A) = \theta(A).$

Let $\theta_j$ be the L-S polynomials of $\phi_j \in \mathrm{H}(\Omega)$ for $j = 1, 2$. Show that $\theta_1\theta_2$ coincides with the L-S polynomial of $\phi_1\phi_2$ on spec $(A)$. Use this fact to prove the identity

(3.1.13) $\phi_1(A)\phi_2(A) = \phi(A)$, $\phi = \phi_1\phi_2.$

2. Prove (3.1.13) by using the definition (3.1.4) and the relation (3.1.5). 

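3. Let the assumptions of Problem 1 hold. Assume that a sequence $\{\phi_m\}_1^{\infty} \subset \mathrm{H}(\Omega)$ converges to $\phi \in \mathrm{H}(\Omega)$. That is, $\{\phi_m\}_1^{\infty}$ converges uniformly on any compact set of $\Omega$. Hence

$\lim_{m \to \infty} \phi_m^{(j)}(\lambda) = \phi^{(j)}(\lambda)$, for any $j \in \mathbb{Z}_+$ and $\lambda \in \Omega$.

Use the definition (3.1.4) to show

(3.1.14) $\lim_{m \to \infty} \phi_m(A) = \phi(A).$

Apply this result to prove

(3.1.15) $e^{A} = \sum_{m=0}^{\infty} \frac{A^m}{m!} \ \Big(= \lim_{N \to \infty} \sum_{m=0}^{N} \frac{A^m}{m!}\Big),$

(3.1.16) $(\lambda I - A)^{-1} = \sum_{m=0}^{\infty} \frac{A^m}{\lambda^{m+1}}$ for $|\lambda| > \rho(A).$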
3.2 Cesaro convergence of matrices

Let

(3.2.1) $A_k = [a_{ij}^{(k)}] \in \mathbb{C}^{m \times n}$, $k = 0, 1, \dots$

be a sequence of matrices. The $p$-th Cesaro sequence is defined as follows. First $A_{k,0} = A_k$ for each $k \in \mathbb{Z}_+$. Then for $p \in \mathbb{N}$, $A_{k,p}$ is defined recursively by

(3.2.2) $A_{k,p} = [a_{ij}^{(k,p)}] := \frac{1}{k+1} \sum_{j=0}^{k} A_{j,p-1}$, $k \in \mathbb{Z}_+$, $p \in \mathbb{N}.$

Definition 3.2.1 A sequence $\{A_k\}_0^{\infty}$ converges to $A = [a_{ij}] \in \mathbb{C}^{m \times n}$ if

$\lim_{k \to \infty} a_{ij}^{(k)} = a_{ij}$, $i = 1, \dots, m$, $j = 1, \dots, n$, that is, $\lim_{k \to \infty} A_k = A.$

A sequence $\{A_k\}_0^{\infty}$ converges $p$-Cesaro to $A = [a_{ij}]$ if $\lim_{k \to \infty} A_{k,p} = A$ for $p \in \mathbb{Z}_+$. A sequence $\{A_k\}_0^{\infty}$ converges $p$-Cesaro exactly to $A = [a_{ij}]$ if $\lim_{k \to \infty} A_{k,p} = A$ and $\{A_{k,p-1}\}_{k=0}^{\infty}$ does not converge.

It is known (e.g. [Har49]) that if $\{A_k\}$ is $p$-Cesaro convergent then $\{A_k\}$ is also $(p+1)$-Cesaro convergent. A simple example of an exactly 1-Cesaro convergent sequence is the sequence $\{\lambda^k\}$, where $|\lambda| = 1$, $\lambda \ne 1$. More generally, see [Har49] or Problem 1:

Lemma 3.2.2 Let $|\lambda| = 1$, $\lambda \ne 1$. Then for $p \in \mathbb{N}$ the sequence $\{\binom{k}{p-1} \lambda^k\}_{k=0}^{\infty}$ is exactly $p$-Cesaro convergent.
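The behaviour described in Lemma 3.2.2 can be observed numerically for $p = 1$ and $p = 2$. The sketch below is an illustration only, written in plain Python; the function name `cesaro`, the choice $\lambda = e^{\sqrt{-1}}$ and the sequence length are assumptions made for the example. It applies the averaging step (3.2.2) to scalar sequences.

```python
import cmath

def cesaro(seq):
    """One averaging step of (3.2.2): the k-th output is the mean of seq[0..k]."""
    out, total = [], 0
    for k, a in enumerate(seq):
        total += a
        out.append(total / (k + 1))
    return out

lam = cmath.exp(1j * 1.0)            # |lam| = 1 and lam != 1 (sample choice)
N = 5000

# p = 1: lam^k does not converge, but its first Cesaro means tend to 0.
powers = [lam**k for k in range(N)]
means_p1 = cesaro(powers)

# p = 2: k * lam^k; the first Cesaro means keep oscillating with modulus
# close to 1/|1 - lam|, while the second Cesaro means tend to 0.
weighted = [k * lam**k for k in range(N)]
first_means = cesaro(weighted)
second_means = cesaro(first_means)

print(abs(means_p1[-1]), abs(first_means[-1]), abs(second_means[-1]))
```

A finite computation only suggests the limits, it does not prove them; the decay of the second Cesaro means is slow, roughly of order $(\log N)/N$.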

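We now show how to recover the component $Z_{\alpha(s_\alpha - 1)}(A)$ for $0 \ne \lambda_\alpha \in$ spec$_{\mathrm{peri}}(A)$ using the notion of Cesaro convergence.

Theorem 3.2.3 Let $A \in \mathbb{C}^{n \times n}$. Assume that $\rho(A) > 0$ and $\lambda_\alpha \in$ spec$_{\mathrm{peri}}(A)$. Let

(3.2.3) $A_k = \frac{(s_\alpha - 1)!}{k^{s_\alpha - 1}} \Big( \frac{\bar{\lambda}_\alpha A}{|\lambda_\alpha|^2} \Big)^k$, $s_\alpha =$ index $\lambda_\alpha.$

Then

(3.2.4) $\lim_{k \to \infty} A_{k,p} = Z_{\alpha(s_\alpha - 1)}$, $p =$ index $A -$ index $\lambda_\alpha + 1.$

The sequence $A_k$ is exactly $p$-Cesaro convergent unless either spec$_{\mathrm{peri}}(A) = \{\lambda_\alpha\}$ or index $\lambda <$ index $\lambda_\alpha$ for any $\lambda \ne \lambda_\alpha$ in spec$_{\mathrm{peri}}(A)$. In these exceptional cases $\lim_{k \to \infty} A_k = Z_{\alpha(s_\alpha - 1)}$.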
Proof. It is enough to consider the case where $\lambda_\alpha = \rho(A) = 1$. By letting $\phi(x) = x^k$ in (3.1.4) we get

(3.2.5) $A^k = \sum_{i=1}^{\ell} \sum_{j=0}^{s_i - 1} \binom{k}{j} \lambda_i^{k-j} Z_{ij}.$

So

$A_k = \sum_{i=1}^{\ell} \sum_{j=0}^{s_i - 1} \frac{(s_\alpha - 1)!}{k^{s_\alpha - 1}} \frac{k(k-1)\cdots(k-j+1)}{j!} \lambda_i^{k-j} Z_{ij}.$

Since the components $Z_{ij}$, $i = 1, \dots, \ell$, $j = 0, \dots, s_i - 1$ are linearly independent, it is enough to analyze the sequence $\frac{(s_\alpha - 1)!}{k^{s_\alpha - 1}} \binom{k}{j} \lambda_i^{k-j}$, $k = j, j+1, \dots$ Clearly for $|\lambda_i| < 1$ and any $j$, or for $|\lambda_i| = 1$ and $j < s_\alpha - 1$, this sequence converges to zero. For $\lambda_i = 1$ and $j = s_\alpha - 1$ the above sequence converges to 1. For $|\lambda_i| = 1$, $\lambda_i \ne 1$ and $j \ge s_\alpha - 1$ the given sequence is exactly $(j - s_\alpha + 2)$-Cesaro convergent to 0 in view of Lemma 3.2.2. From these arguments the theorem easily follows. □

The proof of Theorem 3.2.3 yields:

Corollary 3.2.4 Let the assumptions of Theorem 3.2.3 hold. Then

(3.2.6) $\lim_{N \to \infty} \frac{1}{N+1} \sum_{k=0}^{N} \frac{(s-1)!}{k^{s-1}} \Big( \frac{A}{\rho(A)} \Big)^k = Z$, $s =$ index $A.$

If $\rho(A) \in$ spec $(A)$ and index $\rho(A) = s$ then $Z = Z_{\rho(A)(s-1)}$. Otherwise $Z = 0$.



Problems 

1. Let $|\lambda| = 1$, $\lambda \ne 1$ be fixed. Differentiate the formula

r times with respect to A and divide by r! to obtain 

where f(\,r,£) are some fixed nonzero functions. Use the induction 
on r to prove Lemma 3.2.2. 
2. Let $\phi(x)$ be a normalized polynomial of degree $p - 1$. Prove that the sequence $\{\phi(k)\lambda^k\}_{k=0}^{\infty}$ for $|\lambda| = 1$, $\lambda \ne 1$ is exactly $p$-Cesaro convergent.

3. Let $A \in \mathbb{C}^{n \times n}$. For $\lambda_i \in$ spec $(A)$ let

(3.2.7) $Z_{ij}(A) = [z_{\mu\nu}^{(ij)}]_{\mu,\nu=1}^{n}$, $j = 0, \dots,$ index $\lambda_i - 1.$

Let

(3.2.8) index$_{\mu\nu} \lambda_i := 1 + \max\{j : z_{\mu\nu}^{(ij)} \ne 0, \ j = 0, \dots,$ index $\lambda_i - 1\},$

where index$_{\mu\nu} \lambda_i = 0$ if $z_{\mu\nu}^{(ij)} = 0$ for $j = 0, \dots,$ index $\lambda_i - 1$. Let

(3.2.9) $\rho_{\mu\nu}(A) = \max\{|\lambda_i| :$ index$_{\mu\nu} \lambda_i > 0\},$

where $\rho_{\mu\nu}(A) = -\infty$ if index$_{\mu\nu} \lambda_i = 0$ for all $\lambda_i \in$ spec $(A)$. The quantities index$_{\mu\nu} \lambda_i$, $\rho_{\mu\nu}(A)$ are called the $(\mu, \nu)$ index of $\lambda_i$ and the $(\mu, \nu)$ spectral radius respectively. Alternatively these quantities are called the local index and the local spectral radius respectively. Show that Theorem 3.2.3 and Corollary 3.2.4 could be stated in a local form. That is, for $1 \le \mu, \nu \le n$ assume that

A Q = Pnu(A), s a = indcx^Aa, A k = {a$), A k = [a^k], A k , p = [a^kp] 
where A k and A k , p are given by (3.2.3) and (3.2.2) respectively. Prove 
lim a^ ykp = 4" (Sq_1)) , p = index^A - indcx M „A Q + 1, 



where z^ v — unless Ai = p^ u {A) e spec (A) and indcx^Ai =index M „ 
s. In this exceptional case z^ v — zfy s 

Finally $A$ is called irreducible if $\rho_{\mu\nu}(A) = \rho(A)$ for each $\mu, \nu = 1, \dots, n$. Thus for an irreducible $A$ the local and the global versions of Theorem 3.2.3 and Corollary 3.2.4 coincide.



3.3 An iteration scheme 

Consider an iteration given by

(3.3.1) $x^{i+1} = Ax^{i} + b$, $i = 0, 1, \dots,$

where $A \in \mathbb{C}^{n \times n}$ and $x^{i}, b \in \mathbb{C}^{n}$. Such an iteration can be used to solve a system

(3.3.2) $x = Ax + b.$

Assume that $x$ is the unique solution of (3.3.2) and let $y^{i} := x^{i} - x$. Then

(3.3.3) $y^{i+1} = Ay^{i}$, $i = 0, 1, \dots$

Definition 3.3.1 The system (3.3.3) is called stable if the sequence $y^{i}$, $i = 0, 1, \dots$ converges to zero for any choice of $y^{0}$. The system (3.3.3) is called bounded if the sequence $y^{i}$, $i = 0, 1, \dots$ is bounded for any choice of $y^{0}$.

Clearly, the solution to (3.3.3) is $y^{i} = A^{i} y^{0}$, $i = 0, 1, \dots$ So (3.3.3) is stable if and only if

(3.3.4) $\lim_{i \to \infty} A^{i} = 0.$

Furthermore (3.3.3) is bounded if and only if

(3.3.5) $\|A^{i}\| \le M$, $i = 0, 1, \dots,$

for some (or any) vector norm $\| \cdot \| : \mathbb{C}^{n \times n} \to \mathbb{R}_+$ and some $M > 0$. For example one can choose the $\ell_\infty$ norm on $\mathbb{C}^{m \times n}$ to obtain the induced matrix norm:

(3.3.6) $\|B\| = \max_{1 \le i \le m, \, 1 \le j \le n} |b_{ij}|$, $B = [b_{ij}] \in \mathbb{C}^{m \times n}.$

l<z<m,l<j<n 

See §7.4 and §7.7 for definitions and properties of vector and operator 
norms. 

Theorem 3.3.2 Let $A \in \mathbb{C}^{n \times n}$. Then condition (3.3.4) holds if and only if $\rho(A) < 1$. Condition (3.3.5) holds if and only if either $\rho(A) < 1$, or $\rho(A) = 1$ and index $A = 1$.

Proof. Consider the identity (3.2.5). Since all the components of $A$ are linearly independent, (3.3.4) is equivalent to

$\lim_{k \to \infty} \binom{k}{j} \lambda_i^{k-j} = 0$, $\lambda_i \in$ spec $(A)$, $j = 0, 1, \dots,$ index $\lambda_i - 1.$

Clearly the above conditions are equivalent to $\rho(A) < 1$.

Since all vector norms on $\mathbb{C}^{n \times n}$ are equivalent, the condition (3.3.5) is equivalent to the statement that the sequence $\binom{k}{j} \lambda_i^{k-j}$, $k = 0, \dots,$ is bounded for each $\lambda_i \in$ spec $(A)$ and each $j \in [0,$ index $\lambda_i - 1]$. Hence $\rho(A) \le 1$. Furthermore if $|\lambda_i| = 1$ then index $\lambda_i = 1$. □
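The two cases of Theorem 3.3.2 can be illustrated on $2 \times 2$ examples. The following plain-Python sketch is an illustration, not part of the original text; the function names `mat_vec` and `iterate` and the sample matrices are assumptions chosen for the example. The first matrix has $\rho(A) = 1/2$, so the iteration (3.3.1) converges to the solution of (3.3.2). The second matrix is a Jordan block with eigenvalue 1 of index 2, so the homogeneous iteration (3.3.3) is unbounded.

```python
def mat_vec(A, x):
    """Matrix-vector product for lists of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def iterate(A, b, x0, steps):
    """Run x^{i+1} = A x^i + b for the given number of steps and return the last iterate."""
    x = list(x0)
    for _ in range(steps):
        x = [ax + bi for ax, bi in zip(mat_vec(A, x), b)]
    return x

# Stable case: upper triangular with eigenvalue 0.5 (so rho(A) = 0.5 < 1).
A_stable = [[0.5, 0.25], [0.0, 0.5]]
b = [1.0, 1.0]
x_limit = iterate(A_stable, b, [0.0, 0.0], 100)   # approaches the solution (3, 2) of x = Ax + b

# Unbounded case: rho(A) = 1 but index 1 is violated (a 2x2 Jordan block at eigenvalue 1).
A_shear = [[1, 1], [0, 1]]
y_50 = iterate(A_shear, [0, 0], [0, 1], 50)       # equals A^50 y^0, whose first entry grows linearly

print(x_limit, y_50)
```

In the stable case the error $y^i = A^i y^0$ decays like $i \, 2^{-i}$, in line with the terms $\binom{k}{j}\lambda_i^{k-j}$ appearing in the proof.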

Problems 

1. Let $A \in \mathbb{C}^{n \times n}$ and $\psi$ be the minimal polynomial of $A$ given by (2.5.11). Verify

(3.3.7) $e^{At} = \sum_{i=1}^{\ell} \sum_{j=0}^{s_i - 1} \frac{t^j e^{\lambda_i t}}{j!} Z_{ij}.$

Use (3.1.5) or (3.1.15) to show

(3.3.8) $\frac{d}{dt} e^{At} = Ae^{At} = e^{At}A.$

(In general $t$ may be complex valued, but in this problem we assume that $t$ is real.) Verify that the system

(3.3.9) $\frac{dx}{dt} = Ax$, $x(t) \in \mathbb{C}^n$

has the unique solution

(3.3.10) $x(t) = e^{A(t - t_0)} x(t_0).$

The system (3.3.9) is called stable if $\lim_{t \to \infty} x(t) = 0$ for any solution (3.3.10). The system (3.3.9) is called bounded if any solution $x(t)$ (3.3.10) is bounded on $[t_0, \infty)$. Prove that (3.3.9) is stable if and only if

(3.3.11) $\Re\lambda < 0$ for each $\lambda \in$ spec $(A).$

Furthermore (3.3.9) is bounded if and only if each $\lambda \in$ spec $(A)$ satisfies

(3.3.12) $\Re\lambda \le 0$ and index $\lambda = 1$ if $\Re\lambda = 0.$

3.4 Cauchy integral formula for functions of 
matrices 

Let $A \in \mathbb{C}^{n \times n}$, $\phi \in \mathrm{H}(\Omega)$, where $\Omega$ is an open set in $\mathbb{C}$. If spec $(A) \subset \Omega$ it is possible to define $\phi(A)$ by (3.1.4). The aim of this section is to give an integral formula for $\phi(A)$ using the Cauchy integration formula for $\phi(\lambda)$. The resulting expression looks simple and is very useful in theoretical studies of $\phi(A)$. Moreover, this formula remains valid for bounded operators in Banach spaces (e.g. [Kat80]-[Kat82]).

Consider the function $\phi(x, \lambda) = (\lambda - x)^{-1}$. The domain of analyticity of $\phi(x, \lambda)$ (with respect to $x$) is the complex plane $\mathbb{C}$ punctured at $\lambda$. Thus if $\lambda \notin$ spec $(A)$, (3.1.4) yields

(3.4.1) $(\lambda I - A)^{-1} = \sum_{i=1}^{\ell} \sum_{j=0}^{s_i - 1} (\lambda - \lambda_i)^{-(j+1)} Z_{ij}.$

Definition 3.4.1 The function $(\lambda I - A)^{-1}$ is called the resolvent of $A$ and is denoted by

(3.4.2) $R(\lambda, A) = (\lambda I - A)^{-1}.$

Let $\Gamma = \{\Gamma_1, \dots, \Gamma_k\}$ be a set of disjoint simple, connected, rectifiable curves such that $\Gamma$ forms the boundary $\partial D$ of an open set $D$ and

(3.4.3) $D \cup \Gamma \subset \Omega$, $\Gamma = \partial D.$

For $\phi \in \mathrm{H}(\Omega)$ the classical Cauchy integration formula states (e.g. [Rud74])

(3.4.4) $\phi(\zeta) = \frac{1}{2\pi\sqrt{-1}} \int_{\Gamma} (\lambda - \zeta)^{-1} \phi(\lambda) \, d\lambda$, $\zeta \in D.$

Differentiate the above equality $j$ times to obtain

(3.4.5) $\frac{\phi^{(j)}(\zeta)}{j!} = \frac{1}{2\pi\sqrt{-1}} \int_{\Gamma} (\lambda - \zeta)^{-(j+1)} \phi(\lambda) \, d\lambda$, $\zeta \in D$, $j = 0, 1, 2, \dots$

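Theorem 3.4.2 Let $\Omega$ be an open set in $\mathbb{C}$. Assume that $\Gamma = \{\Gamma_1, \dots, \Gamma_k\}$ is a set of disjoint simple, connected, rectifiable curves such that $\Gamma$ is a boundary of an open set $D$, and $\Gamma \cup D \subset \Omega$. Assume that $A \in \mathbb{C}^{n \times n}$ and spec $(A) \subset D$. Then for any $\phi \in \mathrm{H}(\Omega)$

(3.4.6) $\phi(A) = \frac{1}{2\pi\sqrt{-1}} \int_{\Gamma} R(\lambda, A) \phi(\lambda) \, d\lambda.$

Proof. Insert the expression (3.4.1) into the above integral to obtain

$\frac{1}{2\pi\sqrt{-1}} \int_{\Gamma} R(\lambda, A) \phi(\lambda) \, d\lambda = \sum_{i=1}^{\ell} \sum_{j=0}^{s_i - 1} \Big( \frac{1}{2\pi\sqrt{-1}} \int_{\Gamma} (\lambda - \lambda_i)^{-(j+1)} \phi(\lambda) \, d\lambda \Big) Z_{ij}.$

Use the identity (3.4.5) to deduce

$\frac{1}{2\pi\sqrt{-1}} \int_{\Gamma} R(\lambda, A) \phi(\lambda) \, d\lambda = \sum_{i=1}^{\ell} \sum_{j=0}^{s_i - 1} \frac{\phi^{(j)}(\lambda_i)}{j!} Z_{ij}.$

The definition (3.1.4) yields the equality (3.4.6). □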
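Formula (3.4.6) lends itself to a quick numerical check. The sketch below is an illustration in plain Python, not the author's method: it approximates the contour integral over a circle by the trapezoidal rule, for a $2 \times 2$ matrix only. The function names `resolvent` and `cauchy_integral`, the circle radius and the number of nodes are assumptions made for the example. The sample matrix is a Jordan block at eigenvalue 1, so both components $Z_{10} = I$ and $Z_{11} = H$ contribute.

```python
import cmath

def resolvent(lam, A):
    """(lam*I - A)^{-1} for a 2x2 matrix A, using the adjugate formula."""
    a, b = lam - A[0][0], -A[0][1]
    c, d = -A[1][0], lam - A[1][1]
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def cauchy_integral(phi, A, center=0.0, radius=3.0, nodes=200):
    """Trapezoidal approximation of (1/(2*pi*i)) * integral of R(lam, A) * phi(lam) d(lam)
    over the circle |lam - center| = radius, which must enclose spec(A)."""
    total = [[0j, 0j], [0j, 0j]]
    for m in range(nodes):
        w = cmath.exp(2j * cmath.pi * m / nodes)
        lam = center + radius * w
        R = resolvent(lam, A)
        # d(lam) = i * radius * w * d(theta) with d(theta) = 2*pi/nodes,
        # so the factor i and 2*pi cancel against 1/(2*pi*i).
        weight = phi(lam) * radius * w / nodes
        for i in range(2):
            for j in range(2):
                total[i][j] += R[i][j] * weight
    return total

A_jordan = [[1, 1], [0, 1]]                        # eigenvalue 1 with index 2
approx_cube = cauchy_integral(lambda z: z**3, A_jordan)
approx_exp = cauchy_integral(cmath.exp, A_jordan)
print(approx_cube)
print(approx_exp)
```

By (3.1.4), the first result should be close to $A^3 = I + 3H$ and the second close to $e^{A} = e \, (I + H)$; the trapezoidal rule converges very quickly here because the integrand is analytic near the circle.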
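We generalize the above theorem as follows.

Theorem 3.4.3 Let $\Omega$ be an open set in $\mathbb{C}$. Assume that $\Gamma = \{\Gamma_1, \dots, \Gamma_k\}$ is a set of disjoint simple, connected, rectifiable curves such that $\Gamma$ is a boundary of an open set $D$, and $\Gamma \cup D \subset \Omega$. Assume that $A \in \mathbb{C}^{n \times n}$ and spec $(A) \cap \Gamma = \emptyset$. Let spec$_D(A) :=$ spec $(A) \cap D$. Then for any $\phi \in \mathrm{H}(\Omega)$

(3.4.7) $\sum_{\lambda_i \in \mathrm{spec}_D(A)} \sum_{j=0}^{s_i - 1} \frac{\phi^{(j)}(\lambda_i)}{j!} Z_{ij} = \frac{1}{2\pi\sqrt{-1}} \int_{\Gamma} R(\lambda, A) \phi(\lambda) \, d\lambda.$

If spec$_D(A) = \emptyset$ then the left-hand side of the above identity is zero.
See Problem 1.

We illustrate the usefulness of the Cauchy integral formula by two examples.

Theorem 3.4.4 Let $A \in \mathbb{C}^{n \times n}$ and assume that $\lambda_p \in$ spec $(A)$. Suppose that $D$ and $\Gamma$ satisfy the assumptions of Theorem 3.4.3 ($\Omega = \mathbb{C}$). Assume furthermore that spec $(A) \cap D = \{\lambda_p\}$. Then the $(p, q)$ component of $A$ is given by

(3.4.8) $Z_{pq}(A) = \frac{1}{2\pi\sqrt{-1}} \int_{\Gamma} R(\lambda, A) (\lambda - \lambda_p)^q \, d\lambda.$

($Z_{pq} = 0$ for $q > s_p - 1$.)
See Problem 2.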

Our next example generalizes the first part of Theorem 3.3.2 to a compact set of matrices.

Definition 3.4.5 A set $\mathcal{A} \subset \mathbb{C}^{n \times n}$ is called power stable if

(3.4.9) $\lim_{k \to \infty} \big( \sup_{A \in \mathcal{A}} \|A^k\| \big) = 0,$

for some vector norm on $\mathbb{C}^{n \times n}$. A set $\mathcal{A} \subset \mathbb{C}^{n \times n}$ is called power bounded if

(3.4.10) $\|A^k\| \le K$, for any $A \in \mathcal{A}$ and $k = 0, 1, \dots,$

for some positive $K$ and some vector norm on $\mathbb{C}^{n \times n}$.

Theorem 3.4.6 Let $\mathcal{A} \subset \mathbb{C}^{n \times n}$ be a compact set. Then $\mathcal{A}$ is power stable if and only if $\rho(A) < 1$ for any $A \in \mathcal{A}$.

To prove the theorem we need a well known result on the roots of normalized polynomials in $\mathbb{C}[x]$ (e.g. [Ost66]).

Lemma 3.4.7 Let $p(x) = x^m + \sum_{i=1}^{m} a_i x^{m-i} \in \mathbb{C}[x]$. Then the zeros $\xi_1, \dots, \xi_m$ of $p(x)$ are continuous functions of its coefficients. That is, for given $a_1, \dots, a_m$ and $\epsilon > 0$ there exists $\delta(\epsilon)$, depending on $a_1, \dots, a_m$, such that if $|b_i - a_i| < \delta(\epsilon)$, $i = 1, \dots, m$, it is possible to enumerate the zeros of $q(x) = x^m + \sum_{i=1}^{m} b_i x^{m-i}$ by $\eta_1, \dots, \eta_m$ such that $|\eta_i - \xi_i| < \epsilon$, $i = 1, \dots, m$. In particular the function

(3.4.11) $\rho(p) = \max_{1 \le i \le m} |\xi_i|$

is a continuous function of $a_1, \dots, a_m$.

Corollary 3.4.8 The function $\rho : \mathbb{C}^{n \times n} \to \mathbb{R}_+$, which assigns to $A \in \mathbb{C}^{n \times n}$ its spectral radius $\rho(A)$, is a continuous function.

Proof of Theorem 3.4.6. Suppose that (3.4.9) holds. Then by Theorem 3.3.2 $\rho(A) < 1$ for each $A \in \mathcal{A}$. Assume now that $\mathcal{A}$ is compact and $\rho(A) < 1$ for each $A \in \mathcal{A}$. Corollary 3.4.8 yields

$\rho := \max_{A \in \mathcal{A}} \rho(A) = \rho(\tilde{A}) < 1$, $\tilde{A} \in \mathcal{A}.$

Recall that $(\lambda I - A)^{-1} = \big[ \frac{p_{ij}(\lambda)}{\det(\lambda I - A)} \big]_{i,j=1}^{n}$, where $p_{ij}(\lambda)$ is the $(j, i)$ cofactor of $\lambda I - A$. Let $\lambda_1, \dots, \lambda_n$ be the eigenvalues of $A$ counted with multiplicities. Then for $|\lambda| > \rho$

$|\det(\lambda I - A)| = \Big| \prod_{i=1}^{n} (\lambda - \lambda_i) \Big| \ge \prod_{i=1}^{n} (|\lambda| - \rho) = (|\lambda| - \rho)^n.$

Let $\rho < r < 1$. Since $\mathcal{A}$ is a bounded set, the above arguments yield that there exists a positive constant $K$ such that $\|(\lambda I - A)^{-1}\| \le K$ for each $A \in \mathcal{A}$, $|\lambda| = r$. Apply (3.4.6) to obtain

(3.4.12) $A^p = \frac{1}{2\pi\sqrt{-1}} \int_{|\lambda| = r} (\lambda I - A)^{-1} \lambda^p \, d\lambda,$

for each $A \in \mathcal{A}$. Combine this equality with the estimate $\|(\lambda I - A)^{-1}\| \le K$ for $|\lambda| = r$ to obtain $\|A^p\| \le K r^{p+1}$ for any $A \in \mathcal{A}$. As $r < 1$ the theorem follows. □
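Theorem 3.4.9 Let $\mathcal{A} \subset \mathbb{C}^{n \times n}$. Then $\mathcal{A}$ is power bounded if and only if

(3.4.13) $\|(\lambda I - A)^{-1}\| \le \frac{K}{|\lambda| - 1}$, for all $A \in \mathcal{A}$ and $|\lambda| > 1$,

for some vector norm $\| \cdot \|$ on $\mathbb{C}^{n \times n}$ and $K \ge \|I_n\|$.

Proof. For $|\lambda| > \rho(A)$ we have the Neumann series

(3.4.14) $(\lambda I - A)^{-1} = \sum_{i=0}^{\infty} \frac{A^i}{\lambda^{i+1}}.$

Hence for any vector norm on $\mathbb{C}^{n \times n}$

(3.4.15) $\|(\lambda I - A)^{-1}\| \le \sum_{i=0}^{\infty} \frac{\|A^i\|}{|\lambda|^{i+1}}$, $|\lambda| > \rho(A).$

(See Problem 3.) Assume first that (3.4.10) holds. As $A^0 = I_n$ it follows that $K \ge \|I_n\|$. Furthermore, as each $A \in \mathcal{A}$ is power bounded, Theorem 3.3.2 yields that $\rho(A) \le 1$ for each $A \in \mathcal{A}$. Combine (3.4.10) and (3.4.15) to obtain (3.4.13).

Assume now that (3.4.13) holds. Since all vector norms on $\mathbb{C}^{n \times n}$ are equivalent we assume that the norm in (3.4.13) is the $\ell_\infty$ norm given in (3.3.6). Let $A \in \mathcal{A}$. Note that $\lambda I - A$ is invertible for each $|\lambda| > 1$. Hence $\rho(A) \le 1$. Let $(\lambda I - A)^{-1} = \big[ \frac{p_{ij}(\lambda)}{p(\lambda)} \big]$. Here $p(\lambda) = \det(\lambda I - A)$ is a polynomial of degree $n$ and $p_{ij}(\lambda)$ (the $(j, i)$ cofactor of $\lambda I - A$) is a polynomial of degree $n - 1$ at most. Let $A^p = [a_{ij}^{(p)}]$, $p = 0, 1, \dots$ Then for any $r > 1$ the equality (3.4.12) yields that

$a_{ij}^{(p)} = \frac{1}{2\pi\sqrt{-1}} \int_{|\lambda| = r} \frac{p_{ij}(\lambda)}{p(\lambda)} \lambda^p \, d\lambda = \frac{r^{p+1}}{2\pi} \int_{0}^{2\pi} \frac{p_{ij}(re^{\sqrt{-1}\theta})}{p(re^{\sqrt{-1}\theta})} e^{(p+1)\sqrt{-1}\theta} \, d\theta.$

Problem 6 implies that

$|a_{ij}^{(p)}| \le \frac{4(2n-1) r^{p+1}}{\pi(p+1)} \max_{|\lambda| = r} \Big| \frac{p_{ij}(\lambda)}{p(\lambda)} \Big| \le \frac{4(2n-1) r^{p+1} K}{\pi(p+1)(r-1)}.$

Choose $r = 1 + \frac{1}{p+1}$ to obtain

(3.4.16) $|a_{ij}^{(p)}| \le \frac{4(2n-1)eK}{\pi}$, $i, j = 1, \dots, n$, $p = 0, 1, \dots$, $A \in \mathcal{A}.$

Hence $\|A^p\| \le \frac{4(2n-1)eK}{\pi}$. □
Problems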
Problems 
1. Use the proof of Theorem 3.4.2 to prove Theorem 3.4.3.

2. Prove Theorem 3.4.4 

3. Let A E C nxn . Show the Neumann series converge to the resolvent 
(3.4.14) for any |A| > p(A). (You may use (3.4.1).) Prove (3.4.15) for 
any vector norm on C" x ™. 

4. Let f(x) be a real continuous periodic function on K with period 2tt. 
Assume furthermore that /' is a continuous function on R. (/' is 
periodic of period 2ir.) Then the Fourier series of / converge to / 



(e.g. [Pin09, Cor. 1.2.28]). 




(3.4.17) 



1 




Use integration by parts to conclude that 



(3.4.18) 




Assume that f'(6) vanishes exactly at m(> 2) points on the interval 
[0,2tt). Show that 



H^max^l/WI, 




(Hint. The first inequality of (3.4.19) follows immediately from (3.4.17). 
Assume that /' vanishes at < 9 < ••• < &m-i < 27r < 8 m = 9 + 2tt. 
Then 



| / f(9)e- 2 ^ ke d0\ < / \f(6)\d6 = 




\f(9i) - f(0i-i)\ < 2 max \f(6)\, i = 1, m. 



Use (3.4.18) to deduce the second part of (3.4.19).) 
5. A real periodic function $f$ is called a trigonometric polynomial of degree $n$ if $f$ has the expansion (3.4.17), where $a_k = 0$ for $|k| > n$ and $a_n \ne 0$. Show

(a) A nonzero trigonometric polynomial $f(\theta)$ of degree $n$ vanishes at most at $2n$ points on the interval $[0, 2\pi)$. (Hint. Let $z = e^{\sqrt{-1}\theta}$. Then $f = z^{-n} p(z) |_{|z| = 1}$ for a corresponding polynomial $p$ of degree $2n$.)

(b) Let $f(\theta) = \frac{g(\theta)}{h(\theta)}$ be a nonconstant function, where $g$ is a nonzero trigonometric polynomial of degree $m$ at most and $h$ is a nowhere vanishing trigonometric polynomial of degree $n$. Show that $f'$ has at most $2(m + n)$ zeros on $[0, 2\pi)$.

6. Let p(z), q(z) be nonconstant polynomials of degree m, n respectively. 
Suppose that q(z) does not vanish on the circle \z\ = r > 0. Let 
M := max| z | =r Show that for all k e Z 



(3.4.20) 



1_ f 27r pfre^ 9 ) 4Mmax(m + n,2n-l) 

27ri ^re^ 19 ) 6 7rmax(|fc|, 1) 

_ p( z ) _ p(z)gQ) 



ffint Let F(z) = ^^ — ,{,{ be a nonconstant rational function. 

v ' <l\ z ) q(z)q(z) 

Then F(re^ e ) = h(6) + V^T/ 2 (0), where / 1; / 2 as in Problem 5. 
Clearly |/i(6>)|, \f 2 (0)\ < M. Observe next that 



\q{re^ 



-W\\2 



Hence f[,f 2 vanish at most 2max(m + n, 2n — 1) points on [0, 2tt). 
Use (3.4.19) for f u f 2 to deduce (3.4.20). 

7. Let a > be fixed and assume that A C C nxrl . Show that the 
following statements are equivalent: 

(3.4.21) \\A k \\<k a K, for any A e .4 and fc = 0,1,..., 



(3.4.22) WiXI-A)- 1 ]] < (|A ^^° +a , forallAe4and|A|>l. 

Hint. Use the fact that (-l) k (- {1 + a) )k- a e [a, b], k = 1, ... for some 
< a < b, (e.g.[01v74], p'119).) 
8. Let $A \in \mathbb{C}^{n \times n}$. Using (3.4.1) deduce

(3.4.23) $Z_{i(s_i - 1)} = \lim_{x \to \lambda_i} (x - \lambda_i)^{s_i} (xI - A)^{-1}$, $i = 1, \dots, \ell.$

Let $R(x, A) = [r_{\mu\nu}(x)]$. Using the definitions of Problem 3 of §3.2 show

(3.4.24) $z_{\mu\nu}^{(i(s-1))} = \lim_{x \to \lambda_i} (x - \lambda_i)^{s} r_{\mu\nu}(x)$, if $s =$ index$_{\mu\nu} \lambda_i > 0.$

9. A set $\mathcal{A} \subset \mathbb{C}^{n \times n}$ is called exponentially stable if

(3.4.25) $\lim_{t \to \infty} \sup_{A \in \mathcal{A}} \|e^{At}\| = 0.$

Show that a compact set $\mathcal{A}$ is exponentially stable if and only if $\Re\lambda < 0$ for each $\lambda \in$ spec $(A)$ and each $A \in \mathcal{A}$.

10. A matrix $B \in \mathbb{C}^{n \times n}$ is called a projection (idempotent) if $B^2 = B$. Let $\Gamma$ be a set of simply connected rectifiable curves such that $\Gamma$ forms the boundary of an open bounded set $D \subset \mathbb{C}$. Let $A \in \mathbb{C}^{n \times n}$ and assume that $\Gamma \cap$ spec $(A) = \emptyset$. Define

(3.4.26) $P_D(A) := \frac{1}{2\pi\sqrt{-1}} \int_{\Gamma} R(\lambda, A) \, d\lambda$, $A(D) := \frac{1}{2\pi\sqrt{-1}} \int_{\Gamma} R(\lambda, A) \lambda \, d\lambda.$

Show that $P_D(A)$ is a projection. $P_D(A)$ is called the projection of $A$ on $D$, and $A(D)$ is called the restriction of $A$ to $D$. Prove

(3.4.27) $P_D(A) = \sum_{\lambda_i \in \mathrm{spec}_D(A)} Z_{i0}$, $A(D) = \sum_{\lambda_i \in \mathrm{spec}_D(A)} (\lambda_i Z_{i0} + Z_{i1}).$

Show that the rank of $P_D(A)$ is equal to the number of eigenvalues of $A$ in $D$ counted with their multiplicities. Prove that there exists a neighborhood of $A$ such that $P_D(B)$ and $B(D)$ are analytic functions in $B$ in this neighborhood. In particular, if $D$ satisfies the assumptions of Theorem 3.4.4 then $P_D(A)$ is called the projection of $A$ on $\lambda_p$: $P_D(A) = Z_{p0}$.

11. Let $B = QAQ^{-1} \in \mathbb{C}^{n \times n}$. Assume that $D$ satisfies the assumptions of Problem 10. Show that $P_D(B) = QP_D(A)Q^{-1}$.

12. Let $A \in \mathbb{C}^{n \times n}$ and assume that the minimal polynomial $\psi(x)$ of $A$ is given by (2.5.11). Let $\mathbb{C}^n = U_1 \oplus \dots \oplus U_\ell$, where each $U_p$ is an invariant subspace of $A$ ($AU_p \subset U_p$), such that the minimal polynomial of $A|U_p$ is $(x - \lambda_p)^{s_p}$. Show that

(3.4.28) $U_p = Z_{p0} \mathbb{C}^n.$

Hint. It is enough to consider the case where A is in the Jordan 
canonical form. 

13. Let Di satisfy the assumptions of Problem 10 for i = 1, ...fc. Assume 
that A n Dj = for i^j. Show that P Di (A)C n n P Dj (A)C n = {0} 
for i ^ j. Assume furthermore that Di n spec (A) ^ 0, i = 1, ...,k, 
and spec (A) c uj c =1 D i . Let 

P D ,(A)C" = span (yf \ y«), i = l,...,k, 
X=[y( 1 \...,yW,...,yi fe fc ) ]GC" x ". 

Show that 
(3.4.29) 

fc 

X^AX = ^2®Bi, spec (Bi) = D ; n spec (A), i = l,...,k. 



14. Let Ae C nxn and X p e spec (A). Show that if index A p = 1 then 

(A-\ 3 iy 

AjGspec (A),Aj7^A p 

Hint. Use the Jordan canonical form of A. 



Z p ,= J] (Ay _ J Xj)Sj , 5j =indexA,. 



3.5 A canonical form over $\mathrm{H}_A$

Consider the space $\mathbb{C}^{n \times n}$. Clearly $\mathbb{C}^{n \times n}$ can be identified with $\mathbb{C}^{n^2}$. As in Example 1.1.3 denote by $\mathrm{H}_A$ the set of analytic functions $f(B)$, where $B$ ranges over a neighborhood $D(A, \rho)$ of the form (2.9.5) ($\rho = \rho(f) > 0$). Thus $B = [b_{ij}]$ is an element in $\mathrm{H}_A^{n \times n}$. Let $C \in \mathrm{H}_A^{n \times n}$ and assume that $C = C(B)$ is similar to $B$ over $\mathrm{H}_A$. Then

(3.5.1) $C(B) = X^{-1}(B) B X(B),$

where $X(B) \in \mathrm{H}_A^{n \times n}$ and $\det X(A) \ne 0$. We want to find a "simple" form for $C(B)$ (simpler than $B$!). Let $\mathcal{M}_A$ be the quotient field of $\mathrm{H}_A$ (the set of meromorphic functions in the neighborhood of $A$). If we let $X \in \mathcal{M}_A^{n \times n}$ then we may take $C(B)$ to be $R(B)$, the rational canonical form of $B$ (2.3.3). According to Theorem 2.3.8 $R(B) \in \mathrm{H}_A^{n \times n}$. However $B$ and $R(B)$ are not similar over $\mathrm{H}_A$ in general. (We shall give below the necessary and sufficient conditions for $B \approx R(B)$ over $\mathrm{H}_A$.) For $C(B) = [c_{ij}(B)]$ we may ask how many independent variables are among $c_{ij}(B)$, $i, j = 1, \dots, n$. For $X(B) = I$ the number of independent variables in $C(B) = B$ is $n^2$. Thus we call $C(B)$ simpler than $B$ if $C(B)$ contains fewer independent variables than $B$. For a given $C(B)$ we can view $C(B)$ as a map

(3.5.2) $C(\cdot) : D(A, \rho) \to \mathbb{C}^{n \times n},$

where $D(A, \rho)$ is given by (2.9.5), for some $\rho > 0$. It is well known, e.g. [GuR65], that the number of independent variables is equal to the rank of the Jacobian matrix $DC(\cdot)$ over $\mathcal{M}_A$

(3.5.3) $DC(B) := \Big( \frac{\partial \mu(C)}{\partial b_{ij}}(B) \Big) \in \mathrm{H}_A^{n^2 \times n^2},$

where $\mu$ is the map given in Problem 2.8.2.

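Definition 3.5.1 Let rank $DC$, rank $DC(A)$ be the ranks of $DC(\cdot)$, $DC(A)$ over the fields $\mathcal{M}_A$, $\mathbb{C}$ respectively.

Lemma 3.5.2 Let $C(B)$ be similar to $B$ over $\mathrm{H}_A$. Then

(3.5.4) rank $DC(A) \ge \nu(A, A).$

Proof. Differentiating the relation $X^{-1}(B) X(B) = I$ with respect to $b_{ij}$ we get

$\frac{\partial X^{-1}}{\partial b_{ij}} = -X^{-1} \frac{\partial X}{\partial b_{ij}} X^{-1}.$

So

(3.5.5) $\frac{\partial C}{\partial b_{ij}} = X^{-1} \Big( -\Big( \frac{\partial X}{\partial b_{ij}} X^{-1} \Big) B + B \Big( \frac{\partial X}{\partial b_{ij}} X^{-1} \Big) + E_{ij} \Big) X,$

where

(3.5.6) $E_{ij} = [\delta_{i\alpha} \delta_{j\beta}]_{\alpha,\beta=1}^{m,n} \in \mathbb{C}^{m \times n}$, $i = 1, \dots, m$, $j = 1, \dots, n,$

and $m = n$. So

$X(A) \frac{\partial C}{\partial b_{ij}}(A) X^{-1}(A) = AP_{ij} - P_{ij}A + E_{ij}$, $P_{ij} = \frac{\partial X}{\partial b_{ij}}(A) X^{-1}(A).$

Clearly, $AP_{ij} - P_{ij}A$ is in Range $\hat{A}$, where

(3.5.7) $\hat{A} = (I \otimes A - A^{\top} \otimes I) : \mathbb{C}^{n \times n} \to \mathbb{C}^{n \times n}.$

According to Definition 2.9.1 $\dim$ Range $\hat{A} = r(A, A)$. Let

(3.5.8) $\mathbb{C}^{n \times n} =$ Range $\hat{A} \oplus$ span $(\Gamma_1, \dots, \Gamma_{\nu(A,A)}).$

As $E_{ij}$, $i, j = 1, \dots, n$ is a basis in $\mathbb{C}^{n \times n}$,

$\Gamma_p = \sum_{i,j=1}^{n} \alpha_{ij}^{(p)} E_{ij}$, $p = 1, \dots, \nu(A, A).$

Let

$T_p := \sum_{i,j=1}^{n} \alpha_{ij}^{(p)} \frac{\partial C}{\partial b_{ij}}(A) = X^{-1}(A) (Q_p + \Gamma_p) X(A)$, $Q_p \in$ Range $(\hat{A})$, $p = 1, \dots, \nu(A, A).$

According to (3.5.8) $T_1, \dots, T_{\nu(A,A)}$ are linearly independent. Hence (3.5.4) holds. □

Clearly rank $DC \ge$ rank $DC(A) \ge \nu(A, A)$.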

Theorem 3.5.3 Let $A \in \mathbb{C}^{n \times n}$ and assume that $\Gamma_1, \dots, \Gamma_{\nu(A,A)}$ are any $\nu(A, A)$ matrices satisfying (3.5.8). Then for any nonsingular matrix $P \in \mathbb{C}^{n \times n}$ it is possible to find $X(B) \in \mathrm{H}_A^{n \times n}$, $X(A) = P$, such that

(3.5.9) $X^{-1}(B) B X(B) = P^{-1}AP + \sum_{i=1}^{\nu(A,A)} f_i(B) P^{-1} \Gamma_i P,$
        $f_i \in \mathrm{H}_A$, $f_i(A) = 0$, $i = 1, \dots, \nu(A, A).$

Proof. Let $R_1, \dots, R_{r(A,A)}$ be a basis in Range $\hat{A}$. So there exist $T_i$ such that $AT_i - T_iA = R_i$ for $i = 1, \dots, r(A, A)$. Assume that $X(B)$ is of the form

(3.5.10) $X(B) P^{-1} = I + \sum_{j=1}^{r(A,A)} g_j(B) T_j$, $g_j \in \mathrm{H}_A$, $g_j(A) = 0$, $j = 1, \dots, r(A, A).$

The theorem will follow if we can show that the system

(3.5.11) $B \Big( I + \sum_{j=1}^{r(A,A)} g_j T_j \Big) = \Big( I + \sum_{j=1}^{r(A,A)} g_j T_j \Big) \Big( A + \sum_{i=1}^{\nu(A,A)} f_i \Gamma_i \Big)$

is solvable for some $g_1, \dots, g_{r(A,A)}, f_1, \dots, f_{\nu(A,A)} \in \mathrm{H}_A$ which vanish at $A$. Clearly, the above system is trivially satisfied at $B = A$. The implicit function theorem implies that the above system is solved uniquely if the Jacobian of this system is nonsingular. Let $B = A + F$, $F = [f_{ij}] \in \mathbb{C}^{n \times n}$. Let $\alpha_i(F)$, $\beta_j(F)$ be the linear terms of the Taylor expansions of $f_i(A + F)$, $g_j(A + F)$. The linear part of (3.5.11) reduces to

$F + \sum_{j=1}^{r(A,A)} \beta_j A T_j = \sum_{j=1}^{r(A,A)} \beta_j T_j A + \sum_{i=1}^{\nu(A,A)} \alpha_i \Gamma_i.$

That is, since $AT_j - T_jA = R_j$,

$F = -\sum_{j=1}^{r(A,A)} \beta_j R_j + \sum_{i=1}^{\nu(A,A)} \alpha_i \Gamma_i.$

In view of (3.5.8) $\alpha_1, \dots, \alpha_{\nu(A,A)}, \beta_1, \dots, \beta_{r(A,A)}$ are uniquely determined by $F$. □

Note that if $A = aI$ then the form (3.5.9) is not simpler than $B$. Also by mapping $T \to P^{-1}TP$ we get

(3.5.12) $\mathbb{C}^{n \times n} =$ Range $\widehat{P^{-1}AP} \oplus$ span $(P^{-1}\Gamma_1 P, \dots, P^{-1}\Gamma_{\nu(A,A)} P).$

Lemma 3.5.4 Let $B \in \mathrm{H}_A^{n \times n}$. Then the rational canonical form of $B$ over $\mathcal{M}_A$ is a companion matrix $C(p)$, where $p(x) = \det(xI - B)$.

Proof. The rational canonical form of $B$ is $C(p_1, \dots, p_k)$, given by (2.3.3). We claim that $k = 1$. Otherwise $p(x)$ and $p'(x)$ have a common factor over $\mathcal{M}_A$. Theorem 2.1.9 implies that $p(x)$ and $p'(x)$ have a common factor over $\mathrm{H}_A$. That is, any $B \in D(A, \rho)$ has at least one multiple eigenvalue. Evidently this is false. Consider $C = P^{-1}BP$, where $P \in \mathbb{C}^{n \times n}$ and $J = P^{-1}AP$ is the Jordan canonical form of $A$. So $C \in D(J, \rho')$. Choose $C$ to be upper triangular. (This is possible since $J$ is an upper triangular matrix.) So the eigenvalues of $C$ are the diagonal elements of $C$, and we can choose them to be pairwise distinct. Thus $p(x)$ and $p'(x)$ are coprime over $\mathcal{M}_A$, hence $k = 1$. Furthermore $p_1(x) = \det(xI - C(p)) = \det(xI - B)$. □

Theorem 3.5.5 Let $A \in \mathbb{C}^{n \times n}$. Then $B \in \mathrm{H}_A^{n \times n}$ is similar to the companion matrix $C(p)$, $p(x) = \det(xI - B)$, over $\mathrm{H}_A$ if and only if $\nu(A, A) = n$. That is, the minimal and the characteristic polynomials of $A$ coincide, i.e. $A$ is nonderogatory.

Proof. Assume first that $C(B)$ in (3.5.1) can be chosen to be $C(p)$. Then for $B = A$ we obtain that $A$ is similar to the companion matrix. Corollary 2.8.4 yields $\nu(A, A) = n$. Assume now that $\nu(A, A) = n$. According to (2.8.12) we have that $i_1(x) = i_2(x) = \dots = i_{n-1}(x) = 1$. That is, the minimal and the characteristic polynomials of $A$ coincide, i.e. $A$ is similar to a companion matrix. Use (3.5.9) to see that we may assume that $A$ is a companion matrix. Choose $\Gamma_i = E_{ni}$, $i = 1, \dots, n$, where $E_{ni}$ are defined in (3.5.6).

It is left to show that Range $\hat{A} \cap$ span $(E_{n1}, \dots, E_{nn}) = \{0\}$. Suppose that $\Gamma = \sum_{i=1}^{n} \alpha_i E_{ni} \in$ Range $(\hat{A})$. Theorem 2.10.1 and Corollary 2.8.4 yield that tr $\Gamma A^k = 0$, $k = 0, 1, \dots, n - 1$. Let $\alpha = (\alpha_1, \dots, \alpha_n)$. Since the first $n - 1$ rows of $\Gamma$ are zero rows we have

$0 =$ tr $\Gamma A^k = \alpha A^k e_n$, $e_j = (\delta_{j1}, \dots, \delta_{jn})^{\top}$, $j = 1, \dots, n.$

For $k = 0$ the above equality implies that $\alpha_n = 0$. Suppose that we already proved that these equalities for $k = 0, \dots, \ell$ imply that $\alpha_n = \dots = \alpha_{n-\ell} = 0$. Consider the equality tr $\Gamma A^{\ell+1} = 0$. Use Problem 2.4.9 to deduce

$A^{\ell+1} e_n = e_{n-\ell-1} + \sum_{j=0}^{\ell} f_{(\ell+1)j} e_{n-j}.$

So tr $\Gamma A^{\ell+1} = \alpha_{n-\ell-1}$, as $\alpha_n = \dots = \alpha_{n-\ell} = 0$. Thus $\alpha_{n-\ell-1} = 0$, which implies that $\Gamma = 0$.

Theorem 3.5.3 yields that

$C(B) = X^{-1}(B) B X(B) = A + \sum_{i=1}^{n} f_i(B) E_{ni}.$

So $C(B)$ is a companion matrix. As $\det(xI - C(B)) = \det(xI - B)$ it follows that $C(B) = C(p)$. □

Problem 5 yields:

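Lemma 3.5.6 Let $A_i \in \mathbb{C}^{n_i \times n_i}$, $i = 1, 2$, and assume that

$\mathbb{C}^{n_i \times n_i} =$ Range $\hat{A}_i \oplus$ span $(\Gamma_1^{(i)}, \dots, \Gamma_{\nu(A_i,A_i)}^{(i)})$, $i = 1, 2.$

Suppose that $A_1$ and $A_2$ do not have a common eigenvalue. Then

$\mathbb{C}^{(n_1+n_2) \times (n_1+n_2)} =$ Range $\widehat{A_1 \oplus A_2} \ \oplus$ span $(\Gamma_1^{(1)} \oplus 0, \dots, \Gamma_{\nu(A_1,A_1)}^{(1)} \oplus 0, \ 0 \oplus \Gamma_1^{(2)}, \dots, 0 \oplus \Gamma_{\nu(A_2,A_2)}^{(2)}).$

Theorem 3.5.7 Let $A \in \mathbb{C}^{n \times n}$. Assume that spec $(A)$ consists of $\ell$ distinct eigenvalues $\lambda_1, \dots, \lambda_\ell$, where the multiplicity of $\lambda_i$ is $n_i$ for $i = 1, \dots, \ell$. Then $B$ is similar over $\mathrm{H}_A$ to the matrix

(3.5.13) $C(B) = \bigoplus_{i=1}^{\ell} C_i(B)$, $C_i(B) \in \mathrm{H}_A^{n_i \times n_i}$, $(\lambda_i I_{n_i} - C_i(A))^{n_i} = 0$, $i = 1, \dots, \ell.$

Moreover $C(A)$ is the Jordan canonical form of $A$.

Proof. Choose $P$ in the equality (3.5.9) such that $P^{-1}AP$ is the Jordan canonical form of $A$ and each $P^{-1}\Gamma_i P$ is block diagonal, of the form $\bigoplus_{j=1}^{\ell} \Gamma_j^{(i)}$ with $\Gamma_j^{(i)} \in \mathbb{C}^{n_j \times n_j}$, as follows from Lemma 3.5.6. Then (3.5.9) yields the theorem. □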

Problems 

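1. Let $A = \oplus_{i=1}^k H_{n_i}$, $n = \sum_{i=1}^k n_i$. Partition any $B \in \mathbb{C}^{n\times n}$ as a
block matrix as $A$: $B = [B_{ij}]$, $B_{ij} \in \mathbb{C}^{n_i\times n_j}$, $i,j = 1, \ldots, k$. Using the
results of Theorem 2.8.3 and Theorem 2.10.1 show that the matrices

$\Gamma_{\alpha,\beta,\gamma} = [\Gamma_{ij}^{(\alpha,\beta,\gamma)}]_1^k \in \mathbb{C}^{n\times n},$

$\Gamma_{ij}^{(\alpha,\beta,\gamma)} = 0 \in \mathbb{C}^{n_i\times n_j}, \quad \text{if } (\alpha,\beta) \ne (i,j),$

$\Gamma_{\alpha\beta}^{(\alpha,\beta,\gamma)} = E_{n_\alpha\gamma} \in \mathbb{C}^{n_\alpha\times n_\beta},$

$\gamma = 1, \ldots, \min(n_\alpha, n_\beta), \quad \alpha,\beta = 1, \ldots, k,$

satisfy (3.5.8).

2. Let $A$ be a matrix given by (2.8.4). Use Theorem 3.5.7 and Problem
1 to find a set of matrices $\Gamma_1, \ldots, \Gamma_{\nu(A,A)}$ which satisfy (3.5.8).

3. Let $A \in \mathbb{C}^{n\times n}$ and assume that $\lambda_i$ is a simple eigenvalue of $A$, i.e. $\lambda_i$ is
a simple root of the characteristic polynomial of $A$. Use Theorem 2.8.3
to show the existence of $\lambda(B) \in \mathrm{H}_A$ such that $\lambda(B)$ is an eigenvalue
of $B$ and $\lambda(A) = \lambda_i$.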

4. Let $A$ satisfy the assumptions of Theorem 3.5.7. Denote by $D_p$
an open set satisfying the assumptions of Theorem 3.4.4 for $p = 1, \ldots, \ell$. Let $P_k(B)$ be the projection of $B \in \mathrm{H}_A^{n\times n}$ on $D_k$, $k = 1, \ldots, \ell$. Problem 10 implies that $P_k(B) \in \mathrm{H}_A^{n\times n}$, $k = 1, \ldots, \ell$. Let
$P_k(A)\mathbb{C}^n = \mathrm{span}(x_{k1}, \ldots, x_{kn_k})$, $k = 1, \ldots, \ell$, $B \in D(A,\rho)$, where
$\rho$ is some positive number. Let $X(B) \in \mathrm{H}_A^{n\times n}$ be formed by the
columns $P_k(B)x_{k1}, \ldots, P_k(B)x_{kn_k}$, $k = 1, \ldots, \ell$. Show that $C(B)$ given
by (3.5.1) satisfies (3.5.13). (This yields another proof of Theorem
3.5.7.)

3.6 Analytic, pointwise and rational similarity

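Definition 3.6.1 Let $\Omega \subset \mathbb{C}^m$ and $A, B \in \mathrm{H}(\Omega)^{n\times n}$. Then

(a) $A$ and $B$ are called analytically similar, denoted by $A \overset{a}{\approx} B$, if $A$ and $B$
are similar over $\mathrm{H}(\Omega)$.

(b) $A$ and $B$ are called pointwise similar, denoted by $A \overset{p}{\approx} B$, if $A(x)$ and
$B(x)$ are similar over $\mathbb{C}$ for all $x \in \Omega_0$, for some open set $\Omega_0 \supset \Omega$.

(c) $A$ and $B$ are called rationally similar, denoted by $A \overset{r}{\approx} B$, if $A$ and $B$ are
similar over the field of meromorphic functions $\mathcal{M}(\Omega)$.

Theorem 3.6.2 Let $\Omega \subset \mathbb{C}^m$ and assume that $A, B \in \mathrm{H}(\Omega)^{n\times n}$. Then
$A \overset{a}{\approx} B \Rightarrow A \overset{p}{\approx} B \Rightarrow A \overset{r}{\approx} B$.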

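Proof. Suppose that

(3.6.1) $B(x) = P^{-1}(x)A(x)P(x),$

where $P, P^{-1} \in \mathrm{H}(\Omega)^{n\times n}$. Let $x_0 \in \Omega$. Then (3.6.1) holds in some neighborhood of $x_0$. So $A \overset{p}{\approx} B$. Assume now that $A \overset{p}{\approx} B$. Let $C(p_1, \ldots, p_k)$ and
$C(q_1, \ldots, q_\ell)$ be the rational canonical forms of $A$ and $B$ respectively over
$\mathcal{M}(\Omega)$. Then

$C(p_1, \ldots, p_k) = S(x)^{-1}A(x)S(x), \quad C(q_1, \ldots, q_\ell) = T(x)^{-1}B(x)T(x),$
$S(x), T(x) \in \mathrm{H}(\Omega)^{n\times n}, \quad \det S(x) \not\equiv 0, \quad \det T(x) \not\equiv 0.$

Theorem 2.3.8 yields that $C(p_1, \ldots, p_k), C(q_1, \ldots, q_\ell) \in \mathrm{H}(\Omega)^{n\times n}$. Let $\Omega_0 \supset \Omega$ be an open set such that $A, B, S, T \in \mathrm{H}(\Omega_0)^{n\times n}$ and $A(x)$ and $B(x)$
are similar over $\mathbb{C}$ for any $x \in \Omega_0$. Let $x_0 \in \Omega_0$ be a point such that
$\det S(x_0)T(x_0) \ne 0$. Then for all $x \in D(x_0,\rho)$ we have $C(p_1, \ldots, p_k) = C(q_1, \ldots, q_\ell)$.
The analyticity of $C(p_1, \ldots, p_k)$ and $C(q_1, \ldots, q_\ell)$ implies that these matrices
are identical in $\mathrm{H}(\Omega)$, i.e. $A \overset{r}{\approx} B$. □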

Assume that $A \overset{a}{\approx} B$. Then according to Lemma 2.9.4 the three matrices

(3.6.2) $I \otimes A(x) - A(x)^T \otimes I, \quad I \otimes A(x) - B(x)^T \otimes I, \quad I \otimes B(x) - B(x)^T \otimes I$

are equivalent over $\mathrm{H}(\Omega)$. Theorem 2.9.3 yields.

Theorem 3.6.3 Let $A, B \in \mathrm{H}(\Omega)^{n\times n}$. Assume that the three matrices
in (3.6.2) are equivalent over $\mathrm{H}(\Omega)$. Then $A \overset{p}{\approx} B$.

Assume that $\Omega \subset \mathbb{C}$ is a domain. Then $\mathrm{H}(\Omega)$ is EDD. Hence we can
determine when these matrices are equivalent.

The problem of finding a canonical form of $A \in \mathrm{H}(\Omega)^{n\times n}$ under analytic
similarity is a very hard problem. This problem for the ring of local analytic
functions in one variable will be discussed in the next sections. We now
determine when $A$ is analytically similar to its rational canonical form over
$\mathrm{H}_\zeta$, the ring of local analytic functions in the neighborhood of $\zeta \in \mathbb{C}^m$.

For $A, B \in \mathrm{H}(\Omega)^{n\times n}$ denote by $r(A,B)$ and $\nu(A,B)$ the rank and the
nullity of the matrix $C = I \otimes A - B^T \otimes I$ over the field $\mathcal{M}(\Omega)$. Denote by
$r(A(x),B(x))$ and $\nu(A(x),B(x))$ the rank and the nullity of $C(x)$ over $\mathbb{C}$. As the rank of
$C(x)$ is the largest size of a nonvanishing minor, we deduce

(3.6.3)
$r(A(\zeta),B(\zeta)) \le r(A(x),B(x)) \le r(A,B),$
$\nu(A,B) \le \nu(A(x),B(x)) \le \nu(A(\zeta),B(\zeta)), \quad x \in D(\zeta,\rho)$

for some positive $\rho$. Moreover for any $\rho > 0$ there exists at least one
$x_0 \in D(\zeta,\rho)$ such that

(3.6.4) $r(A(x_0),B(x_0)) = r(A,B), \quad \nu(A(x_0),B(x_0)) = \nu(A,B).$
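
The inequalities (3.6.3) can be checked on a small example. The following Python sketch is an illustration only and is not part of the original development; the helper names `kron`, `kron_diff`, `rank` and `nullity` are assumptions of this sketch. It takes $A(x) = \begin{bmatrix} 0 & x \\ 0 & 0 \end{bmatrix}$, $B = A$ and $\zeta = 0$, forms $C(x) = I \otimes A(x) - A(x)^T \otimes I$ in exact rational arithmetic, and compares the nullity at $x = 0$ with the nullity at a point $x \ne 0$.

```python
from fractions import Fraction

def kron(a, b):
    """Kronecker product of two square matrices given as lists of rows."""
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def kron_diff(a, b):
    """Return I (x) a - b^T (x) I for n x n matrices a and b."""
    n = len(a)
    left = kron(identity(n), a)
    right = kron(transpose(b), identity(n))
    return [[l - r for l, r in zip(lr, rr)] for lr, rr in zip(left, right)]

def rank(m):
    """Rank by Gaussian elimination over the rationals."""
    m = [row[:] for row in m]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [x - f * y for x, y in zip(m[i], m[r])]
        r += 1
    return r

def nullity(a, b):
    c = kron_diff(a, b)
    return len(c) - rank(c)

def A(x):
    x = Fraction(x)
    return [[Fraction(0), x], [Fraction(0), Fraction(0)]]

nu_at_zero = nullity(A(0), A(0))   # C(0) is the zero matrix
nu_generic = nullity(A(1), A(1))   # A(1) is a nilpotent Jordan block of order 2
print(nu_at_zero, nu_generic)
```

In this example the nullity jumps up at $\zeta = 0$ and is constant elsewhere, which is the behaviour allowed by the second line of (3.6.3).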

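Theorem 3.6.4 Let $\zeta \in \mathbb{C}^m$ and $A \in \mathrm{H}_\zeta^{n\times n}$. Assume that $C(p_1, \ldots, p_k)$
is the rational canonical form of $A$ over $\mathcal{M}_\zeta$ and $C(\sigma_1, \ldots, \sigma_\ell)$ is the rational canonical form of $A(\zeta)$ over $\mathbb{C}$. That is $p_i = p_i(\lambda,x)$ and $\sigma_j(\lambda)$ are
normalized polynomials in $\lambda$ belonging to $\mathrm{H}_\zeta[\lambda]$ and $\mathbb{C}[\lambda]$ respectively for
$i = 1, \ldots, k$ and $j = 1, \ldots, \ell$. Then

(a) $\ell \ge k$;

(b) $\prod_{i=0}^{q} \sigma_{\ell-i}(\lambda) \,\big|\, \prod_{i=0}^{q} p_{k-i}(\lambda,\zeta)$ for $q = 0, 1, \ldots, k-1$.

Moreover $\ell = k$ and $p_i(\lambda,\zeta) = \sigma_i(\lambda)$ for $i = 1, \ldots, k$ if and only if

(3.6.5) $r(A(\zeta),A(\zeta)) = r(A,A), \quad \nu(A(\zeta),A(\zeta)) = \nu(A,A),$

which is equivalent to the condition

(3.6.6) $r(A(\zeta),A(\zeta)) = r(A(x),A(x)), \quad \nu(A(\zeta),A(\zeta)) = \nu(A(x),A(x)), \quad x \in D(\zeta,\rho)$

for some positive $\rho$.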
Proof. Let

$u_{n-k+i}(\lambda,x) = \prod_{\alpha=1}^{i} p_\alpha(\lambda,x), \quad v_{n-\ell+j}(\lambda) = \prod_{\beta=1}^{j} \sigma_\beta(\lambda), \quad i = 1, \ldots, k, \ j = 1, \ldots, \ell,$

$u_\alpha(\lambda,x) = v_\beta(\lambda) = 1, \quad \text{for } \alpha \le n-k, \ \beta \le n-\ell.$

So $u_i(\lambda,x)$ and $v_i(\lambda)$ are the g.c.d. of all minors of order $i$ of the matrices
$\lambda I - A$ and $\lambda I - A(\zeta)$ over the rings $\mathcal{M}_\zeta[\lambda]$ and $\mathbb{C}[\lambda]$ respectively. As
$u_i(\lambda,x) \in \mathrm{H}_\zeta[\lambda]$ it is clear that $u_i(\lambda,\zeta)$ divides all the minors of $\lambda I - A(\zeta)$
of order $i$. So $u_i(\lambda,\zeta) \,|\, v_i(\lambda)$ for $i = 1, \ldots, n$. Since $v_{n-\ell} = 1$ it follows that
$u_{n-\ell}(\lambda,x) = 1$. Hence $k \le \ell$. Furthermore

$u_n(\lambda,x) = \det(\lambda I - A(x)), \quad v_n(\lambda) = \det(\lambda I - A(\zeta)).$

Therefore $u_n(\lambda,\zeta) = v_n(\lambda)$ and $\frac{v_n(\lambda)}{v_i(\lambda)} \,\Big|\, \frac{u_n(\lambda,\zeta)}{u_i(\lambda,\zeta)}$. This establishes claims (a)
and (b) of the theorem. Clearly if $C(\sigma_1, \ldots, \sigma_\ell) = C(p_1, \ldots, p_k)(\zeta)$ then $k = \ell$ and $p_i(\lambda,\zeta) = \sigma_i(\lambda)$ for $i = 1, \ldots, \ell$. Assume now that (3.6.5) holds.
According to (2.8.12)

$\nu(A,A) = \sum_{i=1}^{k} (2i-1) \deg p_{k-i+1}(\lambda,x),$

$\nu(A(\zeta),A(\zeta)) = \sum_{j=1}^{\ell} (2j-1) \deg \sigma_{\ell-j+1}(\lambda).$

Note that the degrees of the invariant polynomials of $\lambda I - A$ and $\lambda I - A(\zeta)$
satisfy the assumptions of Problem 2. From the results of Problem 2 it
follows that the second equality in (3.6.5) holds if and only if $k = \ell$ and
$\deg p_i(\lambda,x) = \deg \sigma_i(\lambda)$ for $i = 1, \ldots, k$. Finally (3.6.3-3.6.4) imply the
equivalence of the conditions of (3.6.5) and (3.6.6). □

Corollary 3.6.5 Let $A, B \in \mathrm{H}_\zeta^{n\times n}$. Assume that (3.6.6) holds. Then
$A \overset{a}{\approx} B$ if and only if $A \overset{p}{\approx} B$.

Proof. According to Theorem 3.6.2 it is enough to show that $A \overset{p}{\approx} B$
implies that $A \overset{a}{\approx} B$. Since $A$ satisfies (3.6.6) the assumption that $A \overset{p}{\approx} B$
implies that $B$ satisfies (3.6.6) too. According to Theorem 3.6.4 $A$ and $B$
are analytically similar to their rational canonical forms. From Theorem
3.6.2 it follows that $A$ and $B$ have the same rational canonical form. □



Problems 

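1. Let

$A(x) = \begin{bmatrix} 0 & x \\ 0 & 0 \end{bmatrix}, \quad B(x) = \begin{bmatrix} 0 & x^2 \\ 0 & 0 \end{bmatrix}.$

Show that $A(x)$ and $B(x)$ are rationally similar over $\mathbb{C}(x)$ to $H_2 = A(1)$. Prove that

$A \overset{p}{\not\approx} H_2, \quad B \overset{p}{\not\approx} H_2, \quad A \overset{p}{\approx} B, \quad A \overset{a}{\not\approx} B$

over $\mathbb{C}[x]$.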

2. Let $n$ be a positive integer and assume that $\{\ell_i\}_1^n$, $\{m_i\}_1^n$ are two
nonincreasing sequences of nonnegative integers satisfying

$\sum_{i=1}^{k} \ell_i \le \sum_{i=1}^{k} m_i, \quad k = 1, \ldots, n-1,$

$\sum_{i=1}^{n} \ell_i = \sum_{i=1}^{n} m_i.$

Show (by induction) that

$\sum_{i=1}^{n} (2i-1)m_i \le \sum_{i=1}^{n} (2i-1)\ell_i$

and equality holds if and only if $\ell_i = m_i$, $i = 1, \ldots, n$.

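3. Let $\zeta_n \in \mathbb{C}$, $n = 1, 2, \ldots$, and $\lim_{n\to\infty} \zeta_n = \zeta$. Suppose that $\Omega \subset \mathbb{C}$ is a
connected set and $\zeta_n \in \Omega$, $n = 1, 2, \ldots$, $\zeta \in \Omega$. Recall that if $f \in \mathrm{H}(\Omega)$
and $f(\zeta_n) = 0$, $n = 1, 2, \ldots$, then $f = 0$. Show that for $A, B \in \mathrm{H}(\Omega)^{n\times n}$
the assumption that $A(\zeta_n) \approx B(\zeta_n)$, $n = 1, 2, \ldots$, implies that $A \overset{r}{\approx} B$.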
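
Before attempting the induction in Problem 2 it can help to test the inequality by brute force on small cases. The Python sketch below is only an experiment under our own naming (the functions `nonincreasing`, `dominated`, `weighted` and `check` are assumptions of this sketch, not notation from the text). It enumerates all pairs of nonincreasing sequences with equal total sum, keeps the pairs whose partial sums satisfy the hypothesis, and verifies both the inequality and the equality case.

```python
from itertools import product

def nonincreasing(n, total):
    """All nonincreasing sequences of n nonnegative integers with the given sum."""
    return [s for s in product(range(total + 1), repeat=n)
            if sum(s) == total and all(s[i] >= s[i + 1] for i in range(n - 1))]

def dominated(l, m):
    """True if every partial sum of l is at most the matching partial sum of m."""
    sl = sm = 0
    for a, b in zip(l, m):
        sl += a
        sm += b
        if sl > sm:
            return False
    return True

def weighted(seq):
    """sum_{i=1}^n (2i - 1) * seq_i with the 1-based index i."""
    return sum((2 * i - 1) * v for i, v in enumerate(seq, start=1))

def check(n, total):
    """Verify the inequality and its equality case for all admissible pairs."""
    seqs = nonincreasing(n, total)
    for l in seqs:
        for m in seqs:
            if dominated(l, m):
                if weighted(m) > weighted(l):
                    return False
                if weighted(m) == weighted(l) and l != m:
                    return False
    return True

print(all(check(n, total) for n in range(1, 5) for total in range(0, 7)))
```

For example $\ell = (1,1)$ and $m = (2,0)$ satisfy the hypothesis, and the two weighted sums are $4$ and $2$ respectively.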

3.7 A Global Splitting 

From this section to the end of the chapter we assume that $\Omega$ is a domain
in $\mathbb{C}$. We now give a global version of Theorem 3.5.7.

Theorem 3.7.1 Let $A \in \mathrm{H}(\Omega)^{n\times n}$. Suppose that

(3.7.1) $\det(\lambda I - A(x)) = \phi_1(\lambda,x)\phi_2(\lambda,x),$

where $\phi_1, \phi_2$ are two nontrivial normalized polynomials in $\mathrm{H}(\Omega)[\lambda]$ of positive degrees $n_1$ and $n_2$ respectively. Assume that $(\phi_1(\lambda,x_0), \phi_2(\lambda,x_0)) = 1$
for each $x_0 \in \Omega$. Then there exists $X \in \mathrm{GL}(n,\mathrm{H}(\Omega))$ such that

(3.7.2)
$X^{-1}(x)A(x)X(x) = C_1(x) \oplus C_2(x),$
$C_i(x) \in \mathrm{H}(\Omega)^{n_i\times n_i}, \quad \det(\lambda I - C_i(x)) = \phi_i(\lambda,x), \quad i = 1,2.$






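Proof. Let $P_i(x)$ be the projection of $A(x)$ on the eigenvalues of $A(x)$
satisfying $\phi_i(\lambda,x) = 0$. Since $(\phi_1(\lambda,x_0), \phi_2(\lambda,x_0)) = 1$ it follows that
$P_i(x) \in \mathrm{H}(\Omega)^{n\times n}$ for $i = 1,2$. (See Problem 3.4.10.) Also for any $x_0$ the
rank of $P_i(x_0)$ is $n_i$. Since $\mathrm{H}(\Omega)$ is EDD each $P_i(x)$ can be brought to the
Smith normal form

$P_i(x) = U_i(x)\,\mathrm{diag}(\epsilon_1^{(i)}(x), \ldots, \epsilon_{n_i}^{(i)}(x), 0, \ldots, 0)\,V_i(x),$
$U_i, V_i \in \mathrm{GL}(n,\mathrm{H}(\Omega)), \quad i = 1,2.$

As rank $P_i(x_0) = n_i$ for any $x_0 \in \Omega$ we deduce that $\epsilon_j^{(i)} = 1$, $j = 1, \ldots, n_i$, $i = 1,2$. Let $u_1^{(i)}(x), \ldots, u_n^{(i)}(x)$ be the columns of $U_i(x)$ for $i = 1,2$. As $V_i \in \mathrm{GL}(n,\mathrm{H}(\Omega))$ we obtain

(3.7.3) $P_i(x)\mathbb{C}^n = \mathrm{span}(u_1^{(i)}(x), \ldots, u_{n_i}^{(i)}(x)),$

for any $x \in \Omega$. Let

$X(x) = [u_1^{(1)}(x), \ldots, u_{n_1}^{(1)}(x), u_1^{(2)}(x), \ldots, u_{n_2}^{(2)}(x)] \in \mathrm{H}(\Omega)^{n\times n}.$

According to Problem 3.4.13 $\det X(x_0) \ne 0$ for any $x_0 \in \Omega$. So
$X(x) \in \mathrm{GL}(n,\mathrm{H}(\Omega))$. Then (3.7.2) follows from (3.4.29). □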



3.8 First variation of a geometrically simple 
eigenvalue 

Theorem 3.8.1 Let $A(x)$ be a continuous family of $n \times n$ complex valued matrices for $|x - x_0| < \delta$, where the parameter $x$ is either real or
complex. Suppose that

(3.8.1) $A(x) = A_0 + (x - x_0)A_1 + |x - x_0|\,o(1).$

Assume furthermore that $\lambda_0$ is a geometrically simple eigenvalue of $A_0$ of
multiplicity $m$. Let $x_1, \ldots, x_m$ and $y_1, \ldots, y_m$ be eigenvectors of $A_0$ and
$A_0^T$ respectively corresponding to $\lambda_0$, which form a biorthonormal system
$y_i^T x_j = \delta_{ij}$, $i,j = 1, \ldots, m$. Then it is possible to enumerate the eigenvalues
of $A(x)$ by $\lambda_i(x)$, $i = 1, \ldots, n$, such that

(3.8.2) $\lambda_i(x) = \lambda_0 + (x - x_0)\mu_i + |x - x_0|\,o(1), \quad i = 1, \ldots, m,$

where $\mu_1, \ldots, \mu_m$ are the eigenvalues of the matrix
(3.8.3) $S = [s_{ij}] \in \mathbb{C}^{m\times m}, \quad s_{ij} = y_i^T A_1 x_j, \quad i,j = 1, \ldots, m.$

Proof. By considering the matrix $P^{-1}A(x)P$, for an appropriate $P \in \mathrm{GL}(n,\mathbb{C})$, we can assume that $A_0$ is in the Jordan canonical form such that
the first $m$ diagonal entries of $A_0$ are $\lambda_0$. The proofs of Theorems 3.5.3 and
3.5.7 imply the existence of

$X(B) = I + Z(B), \quad Z \in \mathrm{H}_0^{n\times n}, \quad Z(0) = 0,$

such that

(3.8.4) $X^{-1}(B)(A_0 + B)X(B) = \oplus_{i=1}^{\ell} C_i(B), \quad C_1(0) = \lambda_0 I_m.$

Substituting

$B(x) = A(x) - A_0 = (x - x_0)A_1 + |x - x_0|\,o(1),$
$X(x) = X(B(x)) = I + (x - x_0)X_1 + |x - x_0|\,o(1)$

we get

$C(x) = X^{-1}(x)A(x)X(x) = A_0 + (x - x_0)(A_1 + A_0X_1 - X_1A_0) + |x - x_0|\,o(1).$

According to (3.8.4) $\lambda_1(x), \ldots, \lambda_m(x)$ are the eigenvalues of $C_1(B(x))$. As
$C_1(B(x_0)) = \lambda_0 I_m$, by considering $(C_1(B(x)) - \lambda_0 I_m)/(x - x_0)$ we deduce
that $(\lambda_i(x) - \lambda_0)/(x - x_0)$ are continuous functions at $x_0$. Also

$(C_1(B(x)) - \lambda_0 I_m)/(x - x_0) = [v_i^T(A_1 + A_0X_1 - X_1A_0)u_j]_{i,j=1}^{m} + o(1),$

where $u_i = v_i = (\delta_{i1}, \ldots, \delta_{in})^T$ for $i = 1, \ldots, m$. Since $u_i$ and $v_i$ are the
eigenvectors of $A_0$ and $A_0^T$ respectively corresponding to $\lambda_0$ for $i = 1, \ldots, m$,
it follows that $v_i^T(A_0X_1 - X_1A_0)u_j = 0$ for $i,j = 1, \ldots, m$. This establishes
the result for a particular choice of eigenvectors $u_1, \ldots, u_m$ and $v_1, \ldots, v_m$.
It is left to note that any other choice of the eigenvectors $x_1, \ldots, x_m$ and
$y_1, \ldots, y_m$, which form a biorthonormal system, amounts to a new matrix
$S_1$ which is similar to $S$. In particular $S$ and $S_1$ have the same eigenvalues.

□ 
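
The formula (3.8.2) is easy to test numerically in the simplest case $m = 1$. The following Python sketch is an illustration under our own choices: the matrices `A0`, `A1` and the helper names are hypothetical and are not taken from the text. Here $\lambda_0 = 1$ is a simple eigenvalue of $A_0$, $x_0 = 0$, and $S$ in (3.8.3) reduces to the scalar $\mu = y_1^T A_1 x_1$. The sketch compares $\mu$ with the difference quotient $(\lambda(t) - \lambda_0)/t$ for small $t$.

```python
import cmath

def eigenvalues_2x2(a):
    """Both eigenvalues of a 2 x 2 matrix via the quadratic formula."""
    tr = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return (tr - root) / 2, (tr + root) / 2

A0 = [[1.0, 1.0], [0.0, 2.0]]   # simple eigenvalue lambda_0 = 1
A1 = [[3.0, 5.0], [1.0, 4.0]]   # hypothetical first order perturbation
x1 = [1.0, 0.0]                 # A0 x1 = x1
y1 = [1.0, -1.0]                # y1^T A0 = y1^T and y1^T x1 = 1

# mu = y1^T A1 x1, the 1 x 1 matrix S of (3.8.3)
A1x = [sum(A1[i][j] * x1[j] for j in range(2)) for i in range(2)]
mu = sum(y1[i] * A1x[i] for i in range(2))

def slope(t):
    """(lambda(t) - lambda_0) / t for the eigenvalue of A0 + t*A1 nearest to 1."""
    a = [[A0[i][j] + t * A1[i][j] for j in range(2)] for i in range(2)]
    lam = min(eigenvalues_2x2(a), key=lambda z: abs(z - 1.0))
    return (lam - 1.0) / t

for t in (1e-1, 1e-2, 1e-3):
    print(t, slope(t).real, mu)
```

The difference quotients approach $\mu$ as $t \to 0$, in agreement with (3.8.2).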



Problems 



1. Let $A(x) = \begin{bmatrix} 0 & 1 \\ x & 0 \end{bmatrix}$. Find the eigenvalues and the eigenvectors of
$A(x)$ in terms of $\sqrt{x}$. Show that (3.8.2) does not apply for $x_0 = 0$
in this case. Let $B(x) = A(x^2)$. Show that (3.8.2) holds for $x_0 = 0$ even
though $\lambda_0 = 0$ is not geometrically simple for $B(0)$.

3.9 Analytic similarity over $\mathrm{H}_0$

Let $A, B \in \mathrm{H}_0^{n\times n}$. That is

(3.9.1)
$A(x) = \sum_{k=0}^{\infty} A_k x^k, \quad |x| < r(A),$
$B(x) = \sum_{k=0}^{\infty} B_k x^k, \quad |x| < r(B).$

Definition 3.9.1 For $A, B \in \mathrm{H}_0^{n\times n}$ let $\eta(A,B)$ and $\kappa_p(A,B)$ be the
index and the number of local invariant polynomials of degree $p$ of the matrix
$I_n \otimes A(x) - B(x)^T \otimes I_n$ respectively.

Theorem 3.9.2 Let $A, B \in \mathrm{H}_0^{n\times n}$. Then $A$ and $B$ are analytically
similar over $\mathrm{H}_0$ if and only if $A$ and $B$ are rationally similar over $\mathrm{H}_0$ and
there exist $\eta(A,A) + 1$ matrices $T_0, \ldots, T_\eta \in \mathbb{C}^{n\times n}$ ($\eta = \eta(A,A)$), such that
$\det T_0 \ne 0$ and

(3.9.2) $\sum_{i=0}^{k} A_iT_{k-i} - T_{k-i}B_i = 0, \quad k = 0, \ldots, \eta(A,A).$

Proof. The necessary part of the theorem is obvious. Assume now that
$A(x) \overset{r}{\approx} B(x)$ and the matrices $T_0, \ldots, T_\eta$ satisfy (3.9.2), where $T_0 \in \mathrm{GL}(n,\mathbb{C})$.
Put

$C(x) = T(x)B(x)T^{-1}(x), \quad T(x) = \sum_{k=0}^{\eta} T_k x^k.$

As $\det T_0 \ne 0$ we deduce that $B(x) \overset{a}{\approx} C(x)$. Hence $A(x) \overset{r}{\approx} C(x)$. In
particular $r(A,A) = r(A,C)$. Also (3.9.2) is equivalent to $A(x) - C(x) = x^{\eta+1}O(1)$. Thus

$(I_n \otimes A(x) - A(x)^T \otimes I_n) - (I_n \otimes A(x) - C(x)^T \otimes I_n) = x^{\eta+1}O(1).$

In view of Lemma 1.14.2 the matrices $(I_n \otimes A(x) - A(x)^T \otimes I_n)$, $(I_n \otimes A(x) - C(x)^T \otimes I_n)$ are equivalent over $\mathrm{H}_0$. In particular $\eta(A,A) = \eta(A,C)$. Also
$I, 0, \ldots, 0$ satisfy the system (3.9.2) where $B_i = C_i$, $i = 0, 1, \ldots, \eta$. Theorem
1.14.3 yields the existence of $P(x) \in \mathrm{H}_0^{n\times n}$ such that

$A(x)P(x) - P(x)C(x) = 0, \quad P(0) = I.$

Hence $A(x) \overset{a}{\approx} C(x)$. By the definition $C(x) \overset{a}{\approx} B(x)$. Therefore $A(x) \overset{a}{\approx} B(x)$.

□ 

Note that if $\eta(A,A) = 0$ the assumptions of Theorem 3.9.2 are equivalent to $A(x) \overset{r}{\approx} B(x)$. Then the implication that $A(x) \overset{a}{\approx} B(x)$ follows from
Corollary 3.6.5.

Suppose that the characteristic polynomial of $A(x)$ splits over $\mathrm{H}_0$. That
is

(3.9.3) $\det(\lambda I - A(x)) = \prod_{i=1}^{n} (\lambda - \lambda_i(x)), \quad \lambda_i(x) \in \mathrm{H}_0, \quad i = 1, \ldots, n.$

As $\mathrm{H}_0$ is ED Theorem 2.5.4 yields that $A(x)$ is similar to an upper triangular
matrix. Using Theorem 3.5.7 and Theorem 2.5.4 we obtain that $A(x)$ is
analytically similar to

(3.9.4)
$C(x) = \oplus_{i=1}^{\ell} C_i(x), \quad C_i(x) \in \mathrm{H}_0^{n_i\times n_i},$
$(\alpha_i I_{n_i} - C_i(0))^{n_i} = 0, \quad \alpha_i = \lambda_{n_i}(0), \quad \alpha_i \ne \alpha_j \text{ for } i \ne j, \quad i,j = 1, \ldots, \ell.$

Furthermore each $C_i(x)$ is an upper triangular matrix. In what follows we
are more specific on the form of the upper triangular matrix.

Theorem 3.9.3 Let $A(x) \in \mathrm{H}_0^{n\times n}$. Assume that the characteristic
polynomial of $A(x)$ splits in $\mathrm{H}_0$. Then $A(x)$ is analytically similar to a block
diagonal matrix $C(x)$ of the form (3.9.4) such that each $C_i(x)$ is an upper
triangular matrix whose off-diagonal entries are polynomial in $x$. Moreover, the degree of each polynomial entry above the diagonal in the matrix
$C_i(x)$ does not exceed $\eta(C_i,C_i)$ for $i = 1, \ldots, \ell$.

Proof. In view of Theorem 3.5.7 we may assume that $\ell = 1$. That is,
$A(0)$ has one eigenvalue $\alpha_0$. Furthermore, by considering $A(x) - \alpha_0 I$ we may
assume that $A(0)$ is nilpotent. Also in view of Theorem 3 we may assume
that $A(x)$ is already in the upper triangular form. Suppose in addition to
all the above assumptions $A(x)$ is nilpotent. Define

$X_k = \{y : A^k y = 0, \ y \in \mathrm{H}_0^n\}, \quad k = 0, 1, \ldots.$

Then

$\{0\} = X_0 \subset X_1 \subset X_2 \subset \ldots \subset X_p = \mathrm{H}_0^n.$

Using Theorem 1.12.3 one can show the existence of a basis $y_1(x), \ldots, y_n(x)$
in $\mathrm{H}_0^n$, such that $y_1(x), \ldots, y_{\psi_k}(x)$ is a basis in $X_k$ for $k = 1, \ldots, p$. As
$A(x)X_{k+1} \subset X_k$ we have

$A(x)y_j = \sum_{i=1}^{\psi_k} g_{ij}\,y_i(x), \quad \psi_k < j \le \psi_{k+1}.$

Define $g_{ij} = 0$ for $i > \psi_k$ and $\psi_k < j \le \psi_{k+1}$. Put

$G(x) = [g_{ij}]_1^n, \quad T(x) = [y_1(x), \ldots, y_n(x)] \in \mathrm{H}_0^{n\times n}.$

Since $y_1(x), \ldots, y_n(x)$ is a basis in $\mathrm{H}_0^n$ we deduce that $T(x) \in \mathrm{GL}(n,\mathrm{H}_0)$.
Hence

$G(x) = T^{-1}(x)A(x)T(x), \quad s = \eta(A,A) = \eta(G,G).$

Let

$G(x) = \sum_{j=0}^{\infty} G_j x^j, \quad G^{(k)} = \sum_{j=0}^{k} G_j x^j, \quad k = 0, 1, \ldots.$

We claim that $G \overset{a}{\approx} G^{(s)}$. First note that

$(I_n \otimes G(x) - G(x)^T \otimes I_n) - (I_n \otimes G^{(s)}(x) - G^{(s)}(x)^T \otimes I_n) = x^{s+1}O(1).$

Lemma 1.14.2 implies that the matrices $(I_n \otimes G(x) - G(x)^T \otimes I_n)$, $(I_n \otimes G^{(s)}(x) - G^{(s)}(x)^T \otimes I_n)$ have the same local invariant polynomials up to
the degree $s$. So $r(G,G) \le r(G^{(s)},G^{(s)})$ which is equivalent to

(3.9.5) $\nu(G^{(s)},G^{(s)}) \le \nu(G,G).$

Let

$Y_k = \{y = (y_1, \ldots, y_n)^T : y_j = 0 \text{ for } j > \psi_k\}, \quad k = 0, \ldots, p.$

Clearly if $g_{ij} = 0$ then the $(i,j)$-th entry of $G^{(s)}$ is also equal to zero. By the
definition $g_{ij}(x) = 0$ for $i > \psi_k$ and $\psi_k < j \le \psi_{k+1}$. So $G^{(s)}(x)Y_{k+1} \subset Y_k$
for $k = 0, \ldots, p-1$. Theorem 2.11.2 implies

(3.9.6) $\nu(G(x_0),G(x_0)) \le \nu(G^{(s)}(x_0),G^{(s)}(x_0))$

for all $x_0$ in the neighborhood of the origin. Hence $\nu(G,G) \le \nu(G^{(s)},G^{(s)})$.
This establishes equality in (3.9.5), which in return implies equality in
(3.9.6) for $0 < |x_0| < \rho$. Theorem 2.11.2 yields that $G(x_0) \approx G^{(s)}(x_0)$
for $0 < |x_0| < \rho$. From Theorem 3.6.2 we deduce that $G \overset{r}{\approx} G^{(s)}$. As
$G(x)I - IG^{(s)} = x^{s+1}O(1)$ Theorem 3.9.2 implies that $G \overset{a}{\approx} G^{(s)}$. This establishes the theorem in case that $A(x)$ is a nilpotent matrix.

We now consider the general case where $A(x)$ is an upper triangular
matrix. Without loss of generality we may assume that $A(x)$ is of the form

(3.9.7)
$A(x) = [A_{ij}]_1^{\ell}, \quad A_{ij} \in \mathrm{H}_0^{n_i\times n_j},$
$A_{ij}(x) = 0 \text{ for } j < i, \quad (A_{ii}(x) - \lambda_i(x)I_{n_i})^{n_i} = 0,$
$\lambda_i(x) \not\equiv \lambda_j(x), \text{ for } i \ne j, \quad i,j = 1, \ldots, \ell.$

We already showed that

$A_{ii}(x) = T_i(x)^{-1}F_{ii}(x)T_i(x), \quad T_i \in \mathrm{GL}(n_i,\mathrm{H}_0),$

and each $F_{ii}(x) - \lambda_i(x)I_{n_i}$ is a nilpotent upper triangular matrix with polynomial entries of the form described above. Let

$T(x) = \oplus_{i=1}^{\ell} T_i(x), \quad G(x) = [G_{ij}(x)]_1^{\ell} = T(x)^{-1}A(x)T(x).$

As $\lambda_i(x) \not\equiv \lambda_j(x)$ for $i \ne j$ Problem 3 implies $\nu(G,G) = \sum_{i=1}^{\ell} \nu(G_{ii},G_{ii})$.
Let $G^{(k)}(x) = [G_{ij}^{(k)}]$ be defined as above. Theorem 2.10.2 implies

$\nu(G^{(k)},G^{(k)}) \ge \sum_{i=1}^{\ell} \nu(G_{ii}^{(k)},G_{ii}^{(k)}).$

Using Theorem 2.11.2 as above we obtain $\nu(G_{ii},G_{ii}) \le \nu(G_{ii}^{(k)},G_{ii}^{(k)})$. Combining the above inequalities we obtain $\nu(G,G) \le \nu(G^{(s)},G^{(s)})$. Compare this
inequality with the inequality (3.9.5) to deduce equality in (3.9.5). Hence

(3.9.8) $\nu(G_{ii}^{(s)},G_{ii}^{(s)}) = \nu(G_{ii},G_{ii}), \quad i = 1, \ldots, \ell.$
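Let

(3.9.9)
$D_i(x) = \lambda_i(x)I_{n_i} = \sum_{j=0}^{\infty} D_{ij}x^j,$
$D(x) = \oplus_{i=1}^{\ell} D_i(x), \quad D^{(k)}(x) = \oplus_{i=1}^{\ell} \sum_{j=0}^{k} D_{ij}x^j.$

Then (3.9.8) is equivalent to

$\nu(G_{ii}^{(s)} - D_i^{(s)}, G_{ii}^{(s)} - D_i^{(s)}) = \nu(G_{ii} - D_i, G_{ii} - D_i), \quad i = 1, \ldots, \ell.$

As above Theorem 2.11.2 yields that $G_{ii}^{(s)} - D_i^{(s)} \overset{r}{\approx} G_{ii} - D_i \Rightarrow G_{ii}^{(s)} - D_i^{(s)} + D_i \overset{r}{\approx} G_{ii}$. Since $\lambda_i(x) \not\equiv \lambda_j(x)$ for $i \ne j$ we finally deduce that
$G \overset{r}{\approx} G^{(s)} - D^{(s)} + D$. Also $GI - I(G^{(s)} - D^{(s)} + D) = x^{s+1}O(1)$. Theorem
3.9.2 yields $G \overset{a}{\approx} G^{(s)} - D^{(s)} + D$. The proof of the theorem is completed. □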

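Theorem 3.9.4 Let $P(x)$ and $Q(x)$ be matrices of the form (3.9.4)

(3.9.10)
$P(x) = \oplus_{i=1}^{p} P_i(x), \quad P_i(x) \in \mathrm{H}_0^{m_i\times m_i},$
$(\alpha_i I_{m_i} - P_i(0))^{m_i} = 0, \quad \alpha_i \ne \alpha_j \text{ for } i \ne j, \quad i,j = 1, \ldots, p,$
$Q(x) = \oplus_{j=1}^{q} Q_j(x), \quad Q_j(x) \in \mathrm{H}_0^{n_j\times n_j},$
$(\beta_j I_{n_j} - Q_j(0))^{n_j} = 0, \quad \beta_i \ne \beta_j \text{ for } i \ne j, \quad i,j = 1, \ldots, q.$

Assume furthermore that

(3.9.11) $\alpha_i = \beta_i, \ i = 1, \ldots, t, \quad \alpha_i \ne \beta_j, \ i = t+1, \ldots, p, \ j = t+1, \ldots, q, \quad 0 \le t \le \min(p,q).$

Then the nonconstant local invariant polynomials of $I \otimes P(x) - Q(x)^T \otimes I$
are the nonconstant local invariant polynomials of $I \otimes P_i(x) - Q_i(x)^T \otimes I$
for $i = 1, \ldots, t$. That is

(3.9.12) $\kappa_p(P,Q) = \sum_{i=1}^{t} \kappa_p(P_i,Q_i), \quad p = 1, \ldots.$

In particular if $C(x)$ is of the form (3.9.4) then

(3.9.13) $\eta(C,C) = \max_{1 \le i \le \ell} \eta(C_i,C_i).$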

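Proof. Theorem 1.14.3 implies $\kappa_p(P,Q) = \dim W_{p-1} - \dim W_p$, where
$W_p \subset \mathbb{C}^{n\times n}$ is the subspace of $n \times n$ matrices $X_0$ such that

(3.9.14) $\sum_{j=0}^{k} P_{k-j}X_j - X_jQ_{k-j} = 0, \quad k = 0, \ldots, p.$

Here

$P(x) = \sum_{j=0}^{\infty} P_jx^j, \quad P_i(x) = \sum_{j=0}^{\infty} P_j^{(i)}x^j, \quad P_j = \oplus_{i=1}^{p} P_j^{(i)},$

$Q(x) = \sum_{j=0}^{\infty} Q_jx^j, \quad Q_i(x) = \sum_{j=0}^{\infty} Q_j^{(i)}x^j, \quad Q_j = \oplus_{i=1}^{q} Q_j^{(i)}.$

Partition $X_j$ to $[X_{\alpha\beta}^{(j)}]$, $X_{\alpha\beta}^{(j)} \in \mathbb{C}^{m_\alpha\times n_\beta}$, $\alpha = 1, \ldots, p$, $\beta = 1, \ldots, q$. We claim
that $X_{\alpha\beta}^{(j)} = 0$ if either $\alpha \ge t+1$, or $\beta \ge t+1$, or $\alpha \ne \beta$. Indeed in view of
Lemma 2.8.1 the equation $P_0^{(\alpha)}Y - YQ_0^{(\beta)} = 0$ has only the trivial solution
for $\alpha, \beta$ satisfying the above conditions. Then the claim that $X_{\alpha\beta}^{(j)} = 0$
follows by induction. Thus (3.9.14) splits to the system

$\sum_{j=0}^{k} P_{k-j}^{(\alpha)}X_{\alpha\alpha}^{(j)} - X_{\alpha\alpha}^{(j)}Q_{k-j}^{(\alpha)} = 0, \quad k = 0, \ldots, p, \quad \alpha = 1, \ldots, t.$

Apply the characterizations of $\kappa_p(P,Q)$ and $\kappa_p(P_i,Q_i)$ for $i = 1, \ldots, t$ to
deduce (3.9.12). Clearly (3.9.12) implies (3.9.13). □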

We conclude this section by remarking that the main assumption of Theorem 3.9.3, the splitting of the characteristic polynomial of $A(x)$ in $\mathrm{H}_0$, is not
a heavy restriction in view of the Weierstrass preparation theorem (Theorem 1.7.4). That is, the eigenvalues of $A(y^m)$ split in $\mathrm{H}_0$ for some value of
$m$. Recall that $m$ can always be chosen to be $n!$, i.e. the minimal $m$ divides
$n!$. Problem 1 claims $A(x) \overset{a}{\approx} B(x) \iff A(y^m) \overset{a}{\approx} B(y^m)$. In view of Theorem 3.9.3 the classification problem of analytic similarity classes reduces to
the description of the polynomial entries which are above the diagonal (in
the matrix $C$ in Theorem 3.9.3). Thus given the rational canonical form of
$A(x)$ and the index $\eta(A,A)$ the set of all possible analytic similarity classes
which correspond to $A$ is a certain finite dimensional variety.
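
The passage from $y$ back to $x = y^m$ in Problem 1 rests on averaging over the $m$-th roots of unity: the average $\frac{1}{m}\sum_{k=1}^{m} T(y\omega^k)$, $\omega = e^{2\pi\sqrt{-1}/m}$, keeps exactly the powers of $y$ divisible by $m$, so it is a power series in $y^m$. The Python sketch below illustrates this for a scalar polynomial $T(y)$; the coefficient list and the function names are hypothetical and serve only as an example.

```python
import cmath

def average_over_roots(coeffs, m, y):
    """(1/m) * sum_{k=1}^m T(y * w^k), T(y) = sum_j coeffs[j] * y^j, w = e^{2 pi i / m}."""
    total = 0
    for k in range(1, m + 1):
        w = cmath.exp(2j * cmath.pi * k / m)
        total += sum(c * (y * w) ** j for j, c in enumerate(coeffs))
    return total / m

def keep_multiples(coeffs, m, y):
    """Sum of only those terms coeffs[j] * y^j with j divisible by m."""
    return sum(c * y ** j for j, c in enumerate(coeffs) if j % m == 0)

coeffs = [1, 2, 3, 4, 5, 6, 7]   # T(y) = 1 + 2y + ... + 7y^6, a hypothetical example
m, y = 3, 0.7 + 0.2j
lhs = average_over_roots(coeffs, m, y)
rhs = keep_multiples(coeffs, m, y)
print(abs(lhs - rhs))
```

The same computation applies entrywise to a matrix $T(y)$, which is how $Q(y^m)$ in Problem 1 is obtained.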

The case $n = 2$ is classified completely (Problem 2). In this case to
a given rational canonical form there are at most a countable number of
analytic similarity classes. For $n = 3$ we have an example in which, for a
given rational canonical form, the family of distinct similarity classes
corresponds to a finite dimensional variety (Problem 3).

Problems 

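1. Let $A(x), B(x) \in \mathrm{H}_0^{n\times n}$ and let $m$ be a positive integer. Assume that
$A(y^m)T(y) = T(y)B(y^m)$ where $T(y) \in \mathrm{H}_0^{n\times n}$. Show

$A(x)Q(x) = Q(x)B(x), \quad Q(y^m) = \frac{1}{m}\sum_{k=1}^{m} T(ye^{2\pi\sqrt{-1}k/m}), \quad Q(x) \in \mathrm{H}_0^{n\times n}.$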

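Prove $A(x) \overset{a}{\approx} B(x) \iff A(y^m) \overset{a}{\approx} B(y^m)$.

2. Let $A(x) \in \mathrm{H}_0^{2\times 2}$ and assume that

$\det(\lambda I - A(x)) = (\lambda - \lambda_1(x))(\lambda - \lambda_2(x)),$

$\lambda_i(x) = \sum_{j=0}^{\infty} \lambda_j^{(i)}x^j \in \mathrm{H}_0, \quad i = 1,2,$

$\lambda_j^{(1)} = \lambda_j^{(2)}, \ j = 0, \ldots, p, \quad \lambda_{p+1}^{(1)} \ne \lambda_{p+1}^{(2)}, \quad -1 \le p \le \infty, \quad p = \infty \iff \lambda_1(x) = \lambda_2(x).$

Show that $A(x)$ is analytically similar either to a diagonal matrix or
to

$B(x) = \begin{bmatrix} \lambda_1(x) & x^k \\ 0 & \lambda_2(x) \end{bmatrix}, \quad k = 0, \ldots, p \ (p \ge 0).$

Furthermore if $A(x) \overset{a}{\approx} B(x)$ then $\eta(A,A) = k$. (Hint: Use a similarity
transformation of the form $DAD^{-1}$, where $D$ is a diagonal matrix.)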

3. Let $A(x) \in \mathrm{H}_0^{3\times 3}$. Assume that

$A(x) \overset{r}{\approx} C(p), \quad p(\lambda,x) = \lambda(\lambda - x^{2m})(\lambda - x^{4m}), \quad m \ge 1.$

Show that $A(x)$ is analytically similar to a matrix

$B(x,a) = \begin{bmatrix} 0 & x^{k_1} & a(x) \\ 0 & x^{2m} & x^{k_2} \\ 0 & 0 & x^{4m} \end{bmatrix}, \quad 0 \le k_1, k_2 \le \infty \ (x^{\infty} = 0),$

where $a(x)$ is a polynomial of degree $4m - 1$ at most. (Use Problem
2.) Assume that $k_1 = k_2 = m$. Show that $B(x,a) \overset{a}{\approx} B(x,b)$ if and only
if

(1) if $a(0) \ne 1$ then $b - a$ is divisible by $x^m$.

(2) if $a(0) = 1$ and $\frac{d^i a}{dx^i}(0) = 0$, $i = 1, \ldots, k-1$, $\frac{d^k a}{dx^k}(0) \ne 0$ for $1 \le k < m$
then $b - a$ is divisible by $x^{m+k}$.

(3) if $a(0) = 1$ and $\frac{d^i a}{dx^i}(0) = 0$, $i = 1, \ldots, m$ then $b - a$ is divisible by $x^{2m}$.

Then for $k_1 = k_2 = m$ and $a(0) \in \mathbb{C}\setminus\{1\}$ we can assume that $a(x)$ is a
polynomial of degree less than $m$. Furthermore the similarity class
of $A(x)$ is uniquely determined by such $a(x)$. These similarity classes
are parameterized by $\mathbb{C}\setminus\{1\} \times \mathbb{C}^{m-1}$ (the Taylor coefficients of $a(x)$).

4. Let $P$ and $Q$ satisfy the assumptions of Theorem 3.9.4. Show that $P$
and $Q$ are analytically similar if and only if

$p = q = t, \quad m_i = n_i, \quad P_i(x) \overset{a}{\approx} Q_i(x), \quad i = 1, \ldots, t.$

3.10 Similarity to diagonal matrices

Theorem 3.10.1 Let $A(x) \in \mathrm{H}_0^{n\times n}$ and assume that the characteristic
polynomial of $A(x)$ splits in $\mathrm{H}_0$ as in (3.9.3). Let

(3.10.1) $B(x) = \mathrm{diag}(\lambda_1(x), \ldots, \lambda_n(x)).$

Then $A(x)$ and $B(x)$ are not analytically similar if and only if there exists
a nonnegative integer $p$ such that

(3.10.2)
$\kappa_p(A,A) + \kappa_p(B,B) < 2\kappa_p(A,B),$
$\kappa_j(A,A) + \kappa_j(B,B) = 2\kappa_j(A,B), \quad j = 0, \ldots, p-1, \quad \text{if } p \ge 1.$

In particular $A(x) \overset{a}{\approx} B(x)$ if and only if the three matrices given in (2.9.4)
are equivalent over $\mathrm{H}_0$.

Proof. Suppose first that (3.10.2) holds. Then the three matrices in
(2.9.4) are not equivalent. Hence $A(x) \overset{a}{\not\approx} B(x)$. Assume now that $A(x) \overset{a}{\not\approx} B(x)$.
Without loss of generality we may assume that $A(x) = C(x)$ where $C(x)$
is given in (3.9.4). Let

$B(x) = \oplus_{j=1}^{\ell} B_j(x), \quad B_j(0) = \alpha_j I_{n_j}, \quad j = 1, \ldots, \ell.$

We prove (3.10.2) by induction on $n$. For $n = 1$ (3.10.2) is obvious. Assume
that (3.10.2) holds for $n \le N - 1$. Let $n = N$. If $A(0) \not\approx B(0)$ then
Theorem 2.9.2 implies the inequality (3.10.2) for $p = 0$. Suppose now
$A(0) \approx B(0)$. That is $A_j(0) = B_j(0) = \alpha_j I_{n_j}$, $j = 1, \ldots, \ell$. Suppose first
that $\ell > 1$. Theorem 3.9.4 yields

$\kappa_p(A,A) = \sum_{j=1}^{\ell} \kappa_p(A_j,A_j), \quad \kappa_p(A,B) = \sum_{j=1}^{\ell} \kappa_p(A_j,B_j), \quad \kappa_p(B,B) = \sum_{j=1}^{\ell} \kappa_p(B_j,B_j).$

Problem 4 implies that $A(x) \overset{a}{\not\approx} B(x) \iff A_j(x) \overset{a}{\not\approx} B_j(x)$ for some $j$. Use
the induction hypothesis to deduce (3.10.2). It is left to consider the case

$A(0) = B(0) = \alpha_0 I, \quad \kappa_0(A,A) = \kappa_0(A,B) = \kappa_0(B,B) = 0.$

Let

$A^{(1)}(x) = \frac{A(x) - \alpha_0 I}{x}, \quad B^{(1)}(x) = \frac{B(x) - \alpha_0 I}{x}.$

Clearly

$\kappa_p(A,A) = \kappa_{p-1}(A^{(1)},A^{(1)}), \quad \kappa_p(A,B) = \kappa_{p-1}(A^{(1)},B^{(1)}), \quad \kappa_p(B,B) = \kappa_{p-1}(B^{(1)},B^{(1)}).$

Furthermore $A(x) \overset{a}{\approx} B(x) \iff A^{(1)}(x) \overset{a}{\approx} B^{(1)}(x)$. Continue this process. If
at some (first) stage $k$ either $A^{(k)}(0) \not\approx B^{(k)}(0)$ or $A^{(k)}(0)$ has at least two
distinct eigenvalues we conclude (3.10.2) as above. Suppose finally that
such $k$ does not exist. Then $A(x) = B(x) = \lambda(x)I$, which contradicts the
assumption $A(x) \overset{a}{\not\approx} B(x)$. □



3.11 Strict similarity of matrix polynomials 

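Definition 3.11.1 Let $A(x), B(x) \in \mathbb{C}[x]^{n\times n}$. Then $A(x)$ and $B(x)$
are called strictly similar ($A \overset{s}{\approx} B$) if there exists $P \in \mathrm{GL}(n,\mathbb{C})$ such that
$B(x) = PA(x)P^{-1}$.

Definition 3.11.2 Let $\ell$ be a positive integer and $(A_0, A_1, \ldots, A_\ell), (B_0, \ldots, B_\ell) \in (\mathbb{C}^{n\times n})^{\ell+1}$. Then $(A_0, A_1, \ldots, A_\ell)$ and $(B_0, \ldots, B_\ell)$ are called simultaneously
similar, $(A_0, A_1, \ldots, A_\ell) \approx (B_0, \ldots, B_\ell)$, if there exists $P \in \mathrm{GL}(n,\mathbb{C})$ such that
$B_i = PA_iP^{-1}$, $i = 0, \ldots, \ell$, i.e. $(B_0, B_1, \ldots, B_\ell) = P(A_0, A_1, \ldots, A_\ell)P^{-1}$.

Clearly

Proposition 3.11.3 Let

(3.11.1) $A(x) = \sum_{i=0}^{\ell} A_ix^i, \quad B(x) = \sum_{i=0}^{\ell} B_ix^i \in \mathbb{C}[x]^{n\times n}.$

Then $A \overset{s}{\approx} B$ if and only if $(A_0, A_1, \ldots, A_\ell) \approx (B_0, \ldots, B_\ell)$.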

The problem of simultaneous similarity of matrices, i.e. to describe the
similarity class of a given $m$ ($\ge 2$) tuple of matrices or to decide when
two given tuples of matrices are simultaneously similar, is a hard problem.
See [Fri83]. There are some cases where this problem has a relatively simple
solution.

Theorem 3.11.4 Let $\ell \ge 1$ and $(A_0, \ldots, A_\ell) \in (\mathbb{C}^{n\times n})^{\ell+1}$. Then $(A_0, \ldots, A_\ell)$
is simultaneously similar to a diagonal tuple $(B_0, \ldots, B_\ell) \in (\mathbb{C}^{n\times n})^{\ell+1}$, i.e.
each $B_i$ is a diagonal matrix, if and only if $A_0, \ldots, A_\ell$ are $\ell + 1$ commuting
diagonable matrices:

(3.11.2) $A_iA_j = A_jA_i, \quad i,j = 0, \ldots, \ell.$

Proof. Clearly if $(A_0, \ldots, A_\ell)$ is simultaneously similar to a diagonal
tuple then $A_0, \ldots, A_\ell$ is a set of commuting diagonable matrices. Assume that
$A_0, \ldots, A_\ell$ is a set of commuting diagonable matrices. We show that $(A_0, \ldots, A_\ell)$
is simultaneously similar to a diagonal tuple by double induction on $n$
and $\ell$. It is convenient to let $\ell \ge 0$. For $n = 1$ the theorem trivially holds
for any $\ell \ge 0$. For $\ell = 0$ the theorem trivially holds for any $n \ge 1$. Assume
now that $p > 1$, $q \ge 1$ and assume that the theorem holds for $n \le p - 1$
and all $\ell$ and for $n = p$ and $\ell \le q - 1$. Assume that $A_0, \ldots, A_q \in \mathbb{C}^{p\times p}$
are $q + 1$ commuting diagonable matrices. Suppose first that $A_0 = aI_p$.
The induction hypothesis yields that $(B_1, \ldots, B_q) = P(A_1, \ldots, A_q)P^{-1}$ is a
diagonal $q$-tuple for some $P \in \mathrm{GL}(p,\mathbb{C})$. As $PA_0P^{-1} = A_0 = aI_p$ we
deduce that $(A_0, B_1, \ldots, B_q) = P(A_0, A_1, \ldots, A_q)P^{-1}$.

Assume that $A_0$ is not a scalar matrix, i.e. $A_0 \ne \frac{1}{p}(\mathrm{tr}\, A_0) I_p$. Let

$\tilde A_0 = QA_0Q^{-1} = \oplus_{i=1}^{k} a_iI_{p_i},$

$1 \le p_i, \quad a_i \ne a_j \text{ for } i \ne j, \quad i,j = 1, \ldots, k, \quad \sum_{i=1}^{k} p_i = p.$

Then the $q + 1$ tuple $(\tilde A_0, \ldots, \tilde A_q) = Q(A_0, \ldots, A_q)Q^{-1}$ is a $q + 1$ tuple of diagonable commuting matrices. The specific form of $\tilde A_0$ and the assumption
that $\tilde A_0$ and $\tilde A_j$ commute imply

$\tilde A_j = \oplus_{i=1}^{k} \tilde A_{j,i}, \quad \tilde A_{j,i} \in \mathbb{C}^{p_i\times p_i}, \quad i = 1, \ldots, k, \quad j = 1, \ldots, q.$

The assumption that $(\tilde A_0, \ldots, \tilde A_q)$ is a $q + 1$ tuple of diagonable commuting matrices implies that for each $i$ the tuple $(a_iI_{p_i}, \tilde A_{1,i}, \ldots, \tilde A_{q,i})$ is a $q + 1$ tuple
of diagonable commuting matrices. Since $k \ge 2$ we have $p_i \le p - 1$. Hence the induction hypothesis yields
that $(a_iI_{p_i}, \tilde A_{1,i}, \ldots, \tilde A_{q,i})$ is similar to a $q + 1$ diagonal tuple for $i = 1, \ldots, k$.
It follows straightforwardly that $(A_0, A_1, \ldots, A_q)$ is simultaneously similar to a
diagonal $q + 1$ tuple. □
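
A small integer example makes Theorem 3.11.4 concrete. The Python sketch below is our own illustration (the matrices `P`, `B0`, `B1`, `C` and the helper names are hypothetical). It builds a commuting diagonable pair $A_0, A_1$ from a diagonal pair, checks (3.11.2) and the simultaneous diagonalization $B_i = PA_iP^{-1}$, and exhibits a diagonable matrix $C$ that does not commute with $A_0$, so by the theorem the pair $(A_0, C)$ is not simultaneously similar to a diagonal pair.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_diagonal(a):
    n = len(a)
    return all(a[i][j] == 0 for i in range(n) for j in range(n) if i != j)

P = [[1, 1], [1, 2]]          # det P = 1
P_inv = [[2, -1], [-1, 1]]    # exact inverse of P
B0 = [[1, 0], [0, 3]]
B1 = [[5, 0], [0, -2]]

# A_i = P^{-1} B_i P, so that B_i = P A_i P^{-1}
A0 = matmul(matmul(P_inv, B0), P)
A1 = matmul(matmul(P_inv, B1), P)

commute = matmul(A0, A1) == matmul(A1, A0)
both_diagonal = all(is_diagonal(matmul(matmul(P, a), P_inv)) for a in (A0, A1))

C = [[0, 1], [1, 0]]          # diagonable, but does not commute with A0
c_commutes = matmul(A0, C) == matmul(C, A0)
print(A0, A1, commute, both_diagonal, c_commutes)
```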

The problem when $A(x) \in \mathbb{C}[x]^{n\times n}$ is strictly similar to an upper triangular matrix $B(x) \in \mathbb{C}[x]^{n\times n}$ is equivalent to the problem when an $\ell + 1$
tuple $(A_0, \ldots, A_\ell) \in (\mathbb{C}^{n\times n})^{\ell+1}$ is simultaneously similar to an upper triangular tuple $(B_0, \ldots, B_\ell)$, i.e. each $B_i$ is an upper triangular matrix. This problem is solved in
[DDG51]. We bring their result without a proof.

Definition 3.11.5 Let $\mathbb{D}$ be a domain, let $n, m$ be positive integers and
let $C_1, \ldots, C_m \in \mathbb{D}^{n\times n}$. Then $\mathcal{A}(C_1, \ldots, C_m) \subset \mathbb{D}^{n\times n}$ denotes the minimal
algebra in $\mathbb{D}^{n\times n}$ containing $I_n$ and $C_1, \ldots, C_m$. That is every matrix $F \in \mathcal{A}(C_1, \ldots, C_m)$ is a noncommutative polynomial in $C_1, \ldots, C_m$.

Theorem 3.11.6 Let $n, \ell$ be positive integers and let $(A_0, \ldots, A_\ell) \in (\mathbb{C}^{n\times n})^{\ell+1}$.
TFAE:

(a) $(A_0, \ldots, A_\ell)$ is simultaneously similar to an upper triangular tuple $(B_0, \ldots, B_\ell) \in M_n(\mathbb{C})^{\ell+1}$.

(b) For any $0 \le i < j \le \ell$ and $F \in \mathcal{A}(A_0, \ldots, A_\ell)$ the matrix $(A_iA_j - A_jA_i)F$ is nilpotent.

The implication (a) $\Rightarrow$ (b) is trivial. (See Problem 2.) The verification
of condition (b) can be done quite efficiently. (See Problem 3.)

Corollary 3.11.7 Let $n, \ell$ be positive integers and assume that $A_0, \ldots, A_\ell \in \mathbb{C}^{n\times n}$ are commuting matrices. Then $(A_0, \ldots, A_\ell)$ is simultaneously similar
to an upper triangular tuple $(B_0, \ldots, B_\ell)$.

See Problem 4.
Problems 

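1. Let $\mathbb{F}$ be a field. View $\mathbb{F}^{n\times n}$ as an $n^2$ dimensional vector space over
$\mathbb{F}$. Note that any $A \in \mathbb{F}^{n\times n}$ acts as a linear transformation on $\mathbb{F}^{n\times n}$
by left multiplication: $B \mapsto AB$, $B \in \mathbb{F}^{n\times n}$. Let $A_0, \ldots, A_\ell \in \mathbb{F}^{n\times n}$.
Let $W_0 = \mathrm{span}(I_n)$ and define

$W_k = W_{k-1} + \sum_{j=0}^{\ell} A_jW_{k-1}, \quad k = 1, \ldots.$

Show that $W_{k-1} \subset W_k$ for each $k \ge 1$. Let $p$ be the minimal nonnegative integer for which the equality $W_k = W_{k+1}$ holds. Show
that $\mathcal{A}(A_0, \ldots, A_\ell) = W_p$. In particular $\mathcal{A}(A_0, \ldots, A_\ell)$ is a finite dimensional subspace of $\mathbb{F}^{n\times n}$.

2. Show the implication (a) $\Rightarrow$ (b) in Theorem 3.11.6.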

3. Let the assumptions of Problem 1 hold. Let $X_0 = \mathcal{A}(A_0, \ldots, A_\ell)$ and
define recursively

$X_k = \sum_{0 \le i < j \le \ell} (A_iA_j - A_jA_i)X_{k-1} \subset \mathbb{F}^{n\times n}, \quad k = 1, \ldots.$

Show that the condition (a) of Theorem 3.11.6 is equivalent to the following two
conditions:

(c) $A_iX_k \subset X_k$, $i = 0, \ldots, \ell$, $k = 0, \ldots$.

(d) There exists $q \ge 1$ such that $X_q = \{0\}$ and $X_k$ is a strict subspace
of $X_{k-1}$ for $k = 1, \ldots, q$.

4. Let $A_0, \ldots, A_\ell \in \mathbb{F}^{n\times n}$. Assume that $0 \ne x \in \mathbb{F}^n$ and $A_0x = \lambda_0x$.
Suppose that $A_0A_i = A_iA_0$, $i = 1, \ldots, \ell$.

(a) Show that any nonzero vector in $\mathcal{A}(A_1, \ldots, A_\ell)\,\mathrm{span}(x)$ ($\supset \mathrm{span}(x)$)
is an eigenvector of $A_0$ corresponding to $\lambda_0$.

(b) Assume in addition that $A_1, \ldots, A_\ell$ are commuting matrices whose
characteristic polynomials split in $\mathbb{F}$ to linear factors. Show by induction that there exists $0 \ne y \in \mathcal{A}(A_1, \ldots, A_\ell)\,\mathrm{span}(x)$ such that $A_iy = \lambda_iy$, $i = 0, \ldots, \ell$.

(c) Show that if $A_0, \ldots, A_\ell \in \mathbb{F}^{n\times n}$ are commuting matrices whose
characteristic polynomials split in $\mathbb{F}$ to linear factors then $(A_0, \ldots, A_\ell)$
is simultaneously similar over $\mathrm{GL}(n,\mathbb{F})$ to an upper triangular $\ell + 1$
tuple.
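
The recursion $W_0 \subset W_1 \subset \ldots$ of Problem 1 is also a practical way to compute a basis of the generated algebra. The following Python sketch is one possible implementation over the rationals and is not taken from the text; the function names `span_insert` and `generated_algebra` are assumptions of this sketch. Matrices are flattened to vectors of length $n^2$, and a matrix is added to the basis only if it is linearly independent of the ones already found. Only the matrices added in the previous step need to be multiplied by the generators, since products with older basis elements already lie in the current span.

```python
from fractions import Fraction

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def span_insert(basis, mat):
    """Add mat to the span if it is independent; basis maps pivot column -> (vector, matrix)."""
    vec = [Fraction(v) for row in mat for v in row]
    for col in sorted(basis):              # eliminate in increasing pivot order
        row = basis[col][0]
        if vec[col] != 0:
            f = vec[col] / row[col]
            vec = [v - f * r for v, r in zip(vec, row)]
    for col, v in enumerate(vec):
        if v != 0:                         # first nonzero entry becomes a new pivot
            basis[col] = (vec, mat)
            return True
    return False                           # mat already lies in the span

def generated_algebra(gens):
    """Basis of the algebra generated by I and gens, built as W_0, W_1, ..."""
    n = len(gens[0])
    ident = [[int(i == j) for j in range(n)] for i in range(n)]
    basis = {}
    span_insert(basis, ident)
    frontier = [ident]                     # matrices added in the previous step
    while frontier:
        new = []
        for a in gens:
            for w in frontier:
                prod = matmul(a, w)
                if span_insert(basis, prod):
                    new.append(prod)
        frontier = new
    return [mat for _, mat in basis.values()]

E12 = [[0, 1], [0, 0]]
E21 = [[0, 0], [1, 0]]
J3 = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
print(len(generated_algebra([E12, E21])), len(generated_algebra([J3])))
```

The loop stops as soon as a step adds no new matrix, which corresponds to the equality $W_k = W_{k+1}$; this happens after at most $n^2$ steps.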

3.12 Similarity to diagonal matrices 

Let $A(x) \in \mathrm{H}_0^{n\times n}$. The Weierstrass preparation theorem (Theorem 1.7.4)
implies that the eigenvalues of $A(y^s)$ are analytic in $y$ for some $s \,|\, n!$. That
is the eigenvalues $\lambda_1(x), \ldots, \lambda_n(x)$ are multivalued analytic functions in $x$
which have the expansion

$\lambda_j(x) = \sum_{k=0}^{\infty} \lambda_{jk}x^{k/s}, \quad j = 1, \ldots, n.$

In particular each $\lambda_j$ has $s_j$ branches, where $s_j \,|\, s$. For more properties of
the eigenvalues $\lambda_1(x), \ldots, \lambda_n(x)$ see for example [Kat80, Chap. 2].
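Let $A(x) \in \mathbb{C}[x]^{n\times n}$. Then

(3.12.1) $A(x) = \sum_{k=0}^{\ell} A_kx^k, \quad A_k \in \mathbb{C}^{n\times n}, \quad k = 0, \ldots, \ell.$

The eigenvalues of $A(x)$ satisfy the equation

(3.12.2) $\det(\lambda I - A(x)) = \lambda^n + \sum_{j=1}^{n} a_j(x)\lambda^{n-j}, \quad a_j(x) \in \mathbb{C}[x], \quad j = 1, \ldots, n.$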

Thus the eigenvalues $\lambda_1(x), \ldots, \lambda_n(x)$ are algebraic functions of $x$. (See for
example [GuR65].) For each $\zeta \in \mathbb{C}$ we apply the Weierstrass preparation
theorem in $\mathrm{H}_\zeta$ to obtain the Puiseux expansion of $\lambda_j(x)$ around $x = \zeta$:

(3.12.3) $\lambda_j(x) = \sum_{k=0}^{\infty} \lambda_{jk}(\zeta)(x - \zeta)^{k/s}, \quad j = 1, \ldots, n.$

For simplicity of notation we choose $s \le n!$ for which the above expansion
holds for each $\zeta \in \mathbb{C}$. (For example $s = n!$ is always a valid choice.) Since
$A(x)$ is a polynomial matrix each $\lambda_j(x)$ has a Puiseux expansion at $\infty$. Let

$A(x) = x^{\ell}B\left(\frac{1}{x}\right), \quad B(y) = \sum_{k=0}^{\ell} A_ky^{\ell-k}.$

Then the Puiseux expansion of the eigenvalues of $B(y)$ at $y = 0$ yields

(3.12.4) $\lambda_j(x) = x^{\ell}\sum_{k=0}^{\infty} \lambda_{jk}(\infty)x^{-k/s}, \quad j = 1, \ldots, n.$

Equivalently, we view the eigenvalues $\lambda_j(x)$ as multivalued analytic functions over the Riemann sphere $\mathbb{P} = \mathbb{C} \cup \infty$. To view $A(x)$ as a matrix
function over $\mathbb{P}$ we need to homogenize as in §2.1.

Definition 3.12.1 Let $A(x)$ be given by (3.12.1). Denote by $A(x_0,x_1)$
the corresponding homogeneous matrix

(3.12.5) $A(x_0,x_1) = \sum_{k=0}^{\ell'} A_kx_0^{\ell'-k}x_1^k \in \mathbb{C}[x_0,x_1]^{n\times n},$

where $\ell' = -1$ if $A(x) = 0$, and $A_{\ell'} \ne 0$ and $A_j = 0$ for $\ell' < j \le \ell$ if
$A(x) \ne 0$.
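
As a small worked illustration of Definition 3.12.1 (our own example, not taken from the text), suppose that $\ell = 3$ in (3.12.1) but $A_3 = 0$ and $A_2 \ne 0$. Then $\ell' = 2$, and the homogenization, its dehomogenization at $x_0 = 1$, and its relation to $A(x)$ read:

```latex
\begin{aligned}
A(x) &= A_0 + A_1 x + A_2 x^2, \qquad \ell' = 2,\\
A(x_0,x_1) &= A_0 x_0^{2} + A_1 x_0 x_1 + A_2 x_1^{2},\\
A(1,x) &= A(x), \qquad A(x_0,x_1) = x_0^{2}\, A\!\left(\tfrac{x_1}{x_0}\right) \quad (x_0 \ne 0).
\end{aligned}
```

Every entry of $A(x_0,x_1)$ is either zero or a homogeneous polynomial of degree $\ell'$, which is what allows $A$ to be regarded as a matrix function on the Riemann sphere.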

Let $A(x), B(x) \in \mathbb{C}[x]^{n\times n}$. Then $A(x)$ and $B(x)$ are similar over $\mathbb{C}[x]$,
denoted by $A(x) \approx B(x)$, if $B(x) = P(x)A(x)P^{-1}(x)$ for some $P(x) \in \mathrm{GL}(n,\mathbb{C}[x])$. Lemma 2.9.4 implies that if $A(x) \approx B(x)$ then the three
matrices in (3.6.2) are equivalent over $\mathbb{C}[x]$. Assume a stronger condition
$A \overset{s}{\approx} B$. Clearly if $B(x) = PA(x)P^{-1}$ then $B(x_0,x_1) = PA(x_0,x_1)P^{-1}$.
According to Lemma 2.9.4 the matrices

(3.12.6) $I \otimes A(x_0,x_1) - A(x_0,x_1)^T \otimes I, \quad I \otimes A(x_0,x_1) - B(x_0,x_1)^T \otimes I, \quad I \otimes B(x_0,x_1) - B(x_0,x_1)^T \otimes I,$

are equivalent over $\mathbb{C}[x_0,x_1]$. Lemma 1.11.3 yields.

Lemma 3.12.2 Let $A(x), B(x) \in \mathbb{C}[x]^{n\times n}$. Assume that $A(x) \overset{s}{\approx} B(x)$.
Then the three matrices in (3.12.6) have the same invariant polynomials
over $\mathbb{C}[x_0,x_1]$.

Definition 3.12.3 Let $A(x),B(x) \in \mathbb{C}[x]^{n\times n}$. Let $A(x_0,x_1), B(x_0,x_1)$
be the homogeneous matrices corresponding to $A(x), B(x)$ respectively.
Denote by $i_k(A,B,x_0,x_1)$, $k = 1,\ldots,r(A,B)$, the invariant factors of
$I \otimes A(x_0,x_1) - B(x_0,x_1)^T \otimes I$.

The arguments of the proof of Lemma 2.1.2 imply that $i_k(A,B,x_0,x_1)$
is a homogeneous polynomial for $k = 1,\ldots,r(A,B)$. Moreover $i_k(A,B,1,x)$
are the invariant factors of $I \otimes A(x) - B(x)^T \otimes I$. (See Problems 5-6.)

Theorem 3.12.4 Let $A(x) \in \mathbb{C}[x]^{n\times n}$. Assume that the characteristic
polynomial of $A(x)$ splits to linear factors over $\mathbb{C}[x]$. Let $B(x)$ be the
diagonal matrix of the form (3.10.1). Then $A(x) \approx B(x)$ if and only if the
three matrices in (3.6.2) are equivalent over $\mathbb{C}[x]$. Furthermore $A(x) \overset{s}{\approx} B(x)$
if and only if the three matrices in (3.12.6) have the same invariant factors
over $\mathbb{C}[x_0,x_1]$.

Proof. Clearly if $A(x) \approx B(x)$ then the three matrices in (3.6.2) are
equivalent over $\mathbb{C}[x]$. Similarly if $A(x) \overset{s}{\approx} B(x)$ then the three matrices in
(3.12.6) have the same invariant factors over $\mathbb{C}[x_0,x_1]$. We now show the
opposite implications.

Without loss of generality we may assume that $B(x)$ is of the form

(3.12.7)  $B(x) = \bigoplus_{i=1}^{m} \lambda_i(x) I_{n_i} \in \mathbb{C}[x]^{n\times n}, \quad \lambda_i(x) \ne \lambda_j(x) \text{ for } i \ne j, \quad i,j = 1,\ldots,m.$

Thus for all but a finite number of points $\zeta \in \mathbb{C}$ we have that

(3.12.8)  $\lambda_i(\zeta) \ne \lambda_j(\zeta)$ for $i \ne j$, $i,j = 1,\ldots,m$.

Assume first that $A(x) \sim B(x)$. Let $P_j(A)$ be the projection of $A(x)$ on
$\lambda_j(x)$ for $j = 1,\ldots,m$. Suppose that (3.12.8) is satisfied at $\zeta$. Problem 3.4.10
yields that each $P_j(x)$ is analytic in the neighborhood of $\zeta$. Assume that
(3.12.8) does not hold for $\zeta \in \mathbb{C}$. The assumptions that the three matrices
in (3.12.6) have the same invariant polynomials imply that the matrices in
(3.6.2) are equivalent over $H_\zeta$. Now use Theorem 3.10.1 to get that $A(x) =
Q(x)B(x)Q(x)^{-1}$, $Q \in \mathrm{GL}(n,H_\zeta)$. Clearly $P_j(B)$, the projection of $B(x)$
on $\lambda_j(x)$, is $0 \oplus I_{n_j} \oplus 0$. In particular $P_j(B)$ is analytic in the neighborhood
of any $\zeta \in \mathbb{C}$ and its rank is always equal to $n_j$. Problem 3.4.11 yields that
$P_j(A)(x) = Q(x)P_j(B)(x)Q(x)^{-1} \in H_\zeta^{n\times n}$. Hence $\operatorname{rank} P_j(A)(\zeta) = n_j$ for
all $\zeta \in \mathbb{C}$. Furthermore $P_j(A)(x) \in H_{\mathbb{C}}^{n\times n}$, i.e. each entry of $P_j(A)$ is an
entire function (analytic function on $\mathbb{C}$). Problem 3.4.14 yields that

(3.12.9)  $P_j(A)(\zeta) = \prod_{k=1,\,k\ne j}^{m} \frac{A(\zeta) - \lambda_k(\zeta) I}{\lambda_j(\zeta) - \lambda_k(\zeta)}, \quad j = 1,\ldots,m.$


Hence each entry of $P_j(A)(\zeta)$ is a rational function of $\zeta$ on $\mathbb{C}$. Since
$P_i(A)(x)$ is analytic in the neighborhood of each $\zeta \in \mathbb{C}$ it follows that
$P_i(x) \in \mathbb{C}[x]^{n\times n}$. We also showed that its rank is locally constant, hence
$\operatorname{rank} P_i(x) = n_i$, $i = 1,\ldots,m$. Therefore the Smith normal form of $P_i(x)$
over $\mathbb{C}[x]$ is $P_i(x) = U_i(x)(I_{n_i} \oplus 0)V_i(x)$, $U_i, V_i \in \mathrm{GL}(n,\mathbb{C}[x])$. Let
$u_{1,i}(x),\ldots,u_{n_i,i}(x)$ be the first $n_i$ columns of $U_i(x)$. Then $P_i(x)\mathbb{C}^n =
\operatorname{span}(u_{1,i}(x),\ldots,u_{n_i,i}(x))$. Recall that $P_1(x) + \ldots + P_m(x) = I_n$. Hence
$u_{1,1}(x),\ldots,u_{n_1,1}(x),\ldots,u_{1,m}(x),\ldots,u_{n_m,m}(x)$ is a basis for $\mathbb{C}^n$ for each
$x \in \mathbb{C}$. Let $S(x)$ be the matrix with the columns

$u_{1,1}(x),\ldots,u_{n_1,1}(x),\ldots,u_{1,m}(x),\ldots,u_{n_m,m}(x).$

Then $S(x) \in \mathrm{GL}(n,\mathbb{C}[x])$. Let $D(x) = S^{-1}(x)A(x)S(x) \in \mathbb{C}[x]^{n\times n}$. Since
$A(x)$ is pointwise diagonable, $D(\zeta) = B(\zeta)$, where $\zeta$ satisfies (3.12.8) and
$B(x)$ is of the form (3.12.7). Since only a finite number of points $\zeta \in \mathbb{C}$ do
not satisfy the condition (3.12.8) it follows that $D(x) = B(x)$. This proves
the first part of the theorem.

Assume now that the three matrices in (3.12.6) have the same invariant
factors over $\mathbb{C}[x_0,x_1]$. The same arguments imply that $A(x_0,1) \approx B(x_0,1)$
over the ring $H_0$. That is, $P_j(A)$ is also analytic in the neighborhood of $\zeta = \infty$.
So $P_j(A)$ is analytic on $\mathbb{P}$ hence bounded, i.e. each entry of $P_j(A)$ is
bounded. Hence $P_j(A)$ is a constant matrix. Therefore $S(x)$ is a constant
invertible matrix, i.e. $A(x) \overset{s}{\approx} B(x)$. □

Let $A(x) \in \mathbb{C}[x]^{n\times n}$ be of the form (3.12.1) with $\ell \ge 1$ and $A_\ell \ne 0$.
Assume that $A(x)$ is strictly similar to a diagonal matrix $B(x)$. Then
$A(x)$ is pointwise diagonable, i.e. $A(x)$ is similar to a diagonal matrix
for each $x \in \mathbb{C}$, and $A_\ell \ne 0$ is diagonable. Equivalently, consider the
homogeneous polynomial matrix $A(x_0,x_1)$. Then $A(x_0,x_1)$ is pointwise
diagonable (in $\mathbb{C}^2$). However the assumption that $A(x_0,x_1)$ is pointwise
diagonable does not imply that $A(x)$ is strictly similar to a diagonal
matrix. Consider for example

(3.12.10)  $A(x) = \begin{bmatrix} x^2 & x \\ 0 & 1+x^2 \end{bmatrix}, \quad A(x_0,x_1) = \begin{bmatrix} x_1^2 & x_0 x_1 \\ 0 & x_0^2 + x_1^2 \end{bmatrix}.$

(See Problem 2.)

Definition 3.12.5 Let $A(x) \in \mathbb{C}[x]^{n\times n}$ be of the form (3.12.1) with
$\ell \ge 1$ and $A_\ell \ne 0$. Let $\lambda_p(x)$ and $\lambda_q(x)$ be two distinct eigenvalues of
$A(x)$. ($\lambda_p(x)$ and $\lambda_q(x)$ have distinct Puiseux expansions for any $\zeta \in \mathbb{P}$.)
The eigenvalues $\lambda_p(x)$ and $\lambda_q(x)$ are said to be tangent at $\zeta \in \mathbb{P}$ if their
Puiseux expansions at $\zeta$ satisfy

(3.12.11)  $\lambda_{pk}(\zeta) = \lambda_{qk}(\zeta), \quad k = 0,\ldots,s.$

(Note that two distinct eigenvalues are tangent at $\infty$ if the corresponding
eigenvalues of $A(x,1)$ are tangent at $0$.)

Note that for $A(x)$ given in (3.12.10) the two eigenvalues of $A(x)$, $x^2$ and
$1 + x^2$, are tangent at one point $\zeta = \infty$. (The eigenvalues of $A(x,1)$ are $1$
and $1 + x^2$.)
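The claims about the example (3.12.10) can be checked numerically. The following Python sketch is illustrative only: the function names are hypothetical, and it relies on the observation (an elementary fact, not stated in the text) that a constant similarity $S$ diagonalizing $A(x)$ would diagonalize every coefficient $A_k$ simultaneously, so the coefficients would have to commute.

```python
# Sketch for example (3.12.10): A(x) = A_0 + A_1 x + A_2 x^2.
# All names below are illustrative, not notation from the text.

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A0 = [[0, 0], [0, 1]]
A1 = [[0, 1], [0, 0]]
A2 = [[1, 0], [0, 1]]

def A_hom(x0, x1):
    """Homogenized matrix A(x0, x1) of (3.12.10)."""
    return [[x1 * x1, x0 * x1], [0, x0 * x0 + x1 * x1]]

# If A(x) = S B(x) S^{-1} with constant S and diagonal B(x), then all A_k
# are simultaneously diagonalized by S and must commute. Here they do not.
coefficients_commute = matmul(A0, A1) == matmul(A1, A0)

# Eigenvalues in the chart at infinity, i.e. of A(y, 1), as coefficient
# lists in y: 1 and 1 + y^2. Tangency at y = 0 means that the coefficients
# of y^0 and y^1 agree.
lam_p = [1, 0, 0]
lam_q = [1, 0, 1]
tangent_at_infinity = lam_p[:2] == lam_q[:2] and lam_p != lam_q
```

On the line $x_0 = 0$, where the two eigenvalues of $A(x_0,x_1)$ coincide, the matrix reduces to $x_1^2 I$, which is consistent with pointwise diagonability on $\mathbb{C}^2$.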

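Theorem 3.12.6 Let $A(x) \in \mathbb{C}[x]^{n\times n}$ be of the form (3.12.1) with
$\ell \ge 1$ and $A_\ell \ne 0$. Then one of the following conditions implies that
$A(x) = S(x)B(x)S^{-1}(x)$, where $S(x) \in \mathrm{GL}(n,\mathbb{C}[x])$ and $B(x) \in \mathbb{C}[x]^{n\times n}$
is a diagonal matrix of the form $\bigoplus_{i=1}^{m} \lambda_i(x) I_{k_i}$, where $k_1,\ldots,k_m \ge 1$.
Furthermore $\lambda_1(x),\ldots,\lambda_m(x)$ are $m$ distinct polynomials satisfying the
following conditions:

(a) $\ell \ge \deg \lambda_i(x)$, $i = 1,\ldots,m$.

(b) The polynomial $\lambda_i(x) - \lambda_j(x)$ has only simple roots in $\mathbb{C}$ for $i \ne j$.
($\lambda_i(\zeta) = \lambda_j(\zeta) \Rightarrow \lambda_i'(\zeta) \ne \lambda_j'(\zeta)$.)

I. The characteristic polynomial of $A(x)$ splits in $\mathbb{C}[x]$, i.e. all the
eigenvalues of $A(x)$ are polynomials. $A(x)$ is pointwise diagonable in $\mathbb{C}$
and no two distinct eigenvalues are tangent at any $\zeta \in \mathbb{C}$.

II. $A(x)$ is pointwise diagonable in $\mathbb{C}$ and $A_\ell$ is diagonable. No two
distinct eigenvalues are tangent at any point $\zeta \in \mathbb{C} \cup \{\infty\}$. Then $A(x)$ is
strictly similar to $B(x)$, i.e. $S(x)$ can be chosen in $\mathrm{GL}(n,\mathbb{C})$. Furthermore
$\lambda_1(x),\ldots,\lambda_m(x)$ satisfy the additional condition:

(c) $\deg \lambda_1(x) = \ell$. Furthermore, for $i \ne j$ either $\frac{d^{\ell}\lambda_i}{dx^{\ell}}(0) \ne \frac{d^{\ell}\lambda_j}{dx^{\ell}}(0)$ or
$\frac{d^{\ell}\lambda_i}{dx^{\ell}}(0) = \frac{d^{\ell}\lambda_j}{dx^{\ell}}(0)$ and $\frac{d^{\ell-1}\lambda_i}{dx^{\ell-1}}(0) \ne \frac{d^{\ell-1}\lambda_j}{dx^{\ell-1}}(0)$.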

Proof. View $A(x)$ as a matrix in $\mathcal{M}^{n\times n}$, where $\mathcal{M}$ is the field of rational
functions. Let $\mathbb{K}$ be a finite extension of $\mathcal{M}$ such that $\det(\lambda I - A(x))$ splits
to linear factors over $\mathbb{K}$. Then $A(x)$ has $m$ distinct eigenvalues $\lambda_1,\ldots,\lambda_m \in
\mathbb{K}$ of multiplicities $n_1,\ldots,n_m$ respectively. We view these eigenvalues as
multivalued functions $\lambda_1(x),\ldots,\lambda_m(x)$. Thus for all but a finite number
of points $\zeta$ (3.12.8) holds. Assume that $\zeta$ satisfies (3.12.8). Denote by
$P_j(\zeta)$ the projection of $A(\zeta)$ on $\lambda_j(\zeta)$. Problem 10 implies that $P_j(x)$
is multivalued analytic in the neighborhood of $\zeta$ and $\operatorname{rank} P_j(\zeta) = n_j$.
Problem 14 yields (3.12.9). We claim that in the neighborhood of any $\zeta \in \mathbb{C}$
each $\lambda_j$ and $P_j$ is multivalued analytic and $\operatorname{rank} P_j(x) = n_j$. Let $\zeta \in \mathbb{C}$ be a point for
which (3.12.8) is violated. For simplicity of notation we consider $\lambda_1(x)$ and
$P_1(x)$. Let

$\lambda_1(\zeta) = \ldots = \lambda_r(\zeta) \ne \lambda_k(\zeta), \quad k = r+1,\ldots,m.$

Theorem 3.5.7 implies the existence of $Q(x) \in \mathrm{GL}(n,H_\zeta)$ such that

$Q^{-1}(x)A(x)Q(x) = C_1(x) \oplus C_2(x),$

$C_j(x) \in H_\zeta^{m_j\times m_j}, \quad j = 1,2, \quad m_1 = \sum_{i=1}^{r} n_i, \quad m_2 = n - m_1.$

The eigenvalues of $C_1(x)$ and $C_2(x)$ are $\lambda_1(x),\ldots,\lambda_r(x)$ and $\lambda_{r+1}(x),\ldots,\lambda_m(x)$
respectively in some neighborhood of $\zeta$. Since $C(x) = C_1(x) \oplus C_2(x)$ is pointwise diagonable
in $H_\zeta$ it follows that $C_1(x)$ and $C_2(x)$ are pointwise diagonable in $H_\zeta$. We
claim that $\lambda_i(x) \in H_\zeta$, the projection $\hat P_i(x)$ of $C_1(x)$ on $\lambda_i(x)$ is in $H_\zeta^{m_1\times m_1}$
and $\operatorname{rank} \hat P_i(\zeta) = n_i$ for $i = 1,\ldots,r$. If $r = 1$ then $\lambda_1(x) = \frac{1}{n_1}\operatorname{tr} C_1(x) \in H_\zeta$
and $\hat P_1(x) = I_{n_1}$. Assume that $r > 1$. Since $C_1(\zeta)$ is diagonable and has
one eigenvalue $\lambda_1(\zeta)$ of multiplicity $m_1$ it follows that $C_1(\zeta) = \lambda_1(\zeta)I_{m_1}$.
Hence

$C_1(x) = \lambda_1(\zeta)I_{m_1} + (x - \zeta)\hat C_1(x), \quad \hat C_1(x) \in H_\zeta^{m_1\times m_1}.$

Clearly $\hat C_1(x)$ has $r$ distinct eigenvalues $\hat\lambda_1(x),\ldots,\hat\lambda_r(x)$ such that

$\lambda_i(x) = \lambda_1(\zeta) + (x - \zeta)\hat\lambda_i(x), \quad i = 1,\ldots,r.$

Each $\hat\lambda_i(x)$ has a Puiseux expansion (3.12.3). The above equality shows
that for $1 \le i < j \le r$, $\lambda_i(x)$ and $\lambda_j(x)$ are not tangent if and only if
$\hat\lambda_i(\zeta) \ne \hat\lambda_j(\zeta)$. By the assumption of the theorem no two different
eigenvalues of $A(x)$ are tangent in $\mathbb{C}$. Hence $\hat\lambda_i(\zeta) \ne \hat\lambda_j(\zeta)$ for all $i \ne j \le r$.
That is, $\hat C_1(\zeta)$ has $r$ distinct eigenvalues. Apply Theorem 3.5.7 to $\hat C_1(x)$
to deduce that $\hat C_1(x)$ is analytically similar to $\tilde C_1 \oplus \ldots \oplus \tilde C_r$ such that $\tilde C_i$
has a unique eigenvalue $\hat\lambda_i(x)$ of multiplicity $n_i$ for $i = 1,\ldots,r$. Hence
$\hat\lambda_i(x) = \frac{1}{n_i}\operatorname{tr}\tilde C_i(x) \in H_\zeta \Rightarrow \lambda_i(x) \in H_\zeta$. Clearly the projection of $\tilde C_i(x)$
on $\hat\lambda_i(x)$ is $I_{n_i}$. Hence $\hat P_i(x)$ is analytically similar to the projection
$0 \oplus \ldots \oplus I_{n_i} \oplus \ldots \oplus 0$. So $\hat P_i(x) \in H_\zeta^{m_1\times m_1}$, $\operatorname{rank} \hat P_i(x) = n_i$ for $i = 1,\ldots,r$.
Hence $P_1(x) \in H_\zeta^{n\times n}$, $\operatorname{rank} P_1(x) = n_1$ as we claimed.

Assume now that $\lambda_1(x),\ldots,\lambda_n(x)$ are polynomials. Hence $\lambda_j(x)$ and $P_j(x)$ are
entire functions on $\mathbb{C}$. (See for example [Rud74].) Since $\lim_{|x|\to\infty} \frac{A(x)}{x^{\ell}} = A_\ell$
it follows that $\limsup_{|x|\to\infty} \frac{|\lambda_j(x)|}{|x|^{\ell}} \le \rho(A_\ell)$, where $\rho(A_\ell)$ is the spectral radius
of $A_\ell$. Hence each $\lambda_j(x)$ is a polynomial of degree $\ell$ at most. Since $A_\ell \ne 0$
it follows that at least one of the $\lambda_j(x)$ is a polynomial of degree $\ell$ exactly.
We may assume that $\deg \lambda_1(x) = \ell$. This proves the condition (a) of the
theorem. The condition (b) is equivalent to the statement that no two
distinct eigenvalues of $A(x)$ are tangent in $\mathbb{C}$.

Define $P_j(\zeta)$ by (3.12.9). As in the proof of Theorem 3.12.4 it follows
that $P_i(x) \in \mathbb{C}[x]^{n\times n}$ and $\operatorname{rank} P_i(x) = n_i$, $i = 1,\ldots,m$. Furthermore we
define $S(x) \in \mathrm{GL}(n,\mathbb{C}[x])$ as in the proof of Theorem 3.12.4 such that
$B(x) = S^{-1}(x)A(x)S(x)$. This proves the first part of the theorem.

To prove the second part of the theorem observe that in view of our
definition of tangency at $\infty$ the condition (c) is equivalent to the condition
that no two distinct eigenvalues of $A$ are tangent at infinity. Assume now
that $A_\ell$ is diagonable and no two distinct eigenvalues are tangent at $\infty$.
Then the above arguments show that each $P_i(x)$ is also multivalued analytic
at $\infty$. By considering $x^{-\ell}A(x)$ it follows that $P_i(x)$ is bounded in the
neighborhood of $\infty$. Hence $P_i(x) = P_i(0)$ for $i = 1,\ldots,m$. Thus $S \in
\mathrm{GL}(n,\mathbb{C})$. So $A(x)$ is diagonable by a constant matrix. In particular all
the eigenvalues of $A(x)$ are polynomials. Since no two distinct eigenvalues
are tangent at $\infty$ we deduce that the condition (c) holds. □

Problems 

1. Let $A(x) \in \mathbb{C}[x]^{n\times n}$. Assume that there exists an infinite sequence
of distinct points $\{\zeta_k\}_{1}^{\infty}$ such that $A(\zeta_k)$ is diagonable for $k = 1,\ldots$.
Show that $A(x)$ is diagonable for all but a finite number of points.
(Hint: Consider the rational canonical form of $A(x)$ over the field of
rational functions $\mathbb{C}(x)$.)

2. Consider the matrix $A(x)$ given in (3.12.10). Show

(a) $A(x)$ and $A(x_0,x_1)$ are pointwise similar to diagonal matrices in
$\mathbb{C}$ and $\mathbb{C}^2$ respectively.

(b) The eigenvalues of $A(x)$ are not tangent at any point in $\mathbb{C}$.

(c) Find $S(x) \in \mathrm{GL}(2,\mathbb{C}[x])$ such that $S^{-1}(x)A(x)S(x) = \operatorname{diag}(x^2, 1 + x^2)$.

(d) Show that $A(x)$ is not strictly similar to $\operatorname{diag}(x^2, 1 + x^2)$.

(e) Show that the eigenvalues of $A(x)$ are tangent at $\zeta = \infty$.

3.13 Property L 

In this section and the next one we assume that all pencils $A(x) = A_0 + A_1x$
are square pencils, i.e. $A(x) \in \mathbb{C}[x]^{n\times n}$, and $A_1 \ne 0$ unless stated otherwise.
Then $A(x_0,x_1) = A_0x_0 + A_1x_1$.

Definition 3.13.1 A pencil $A(x) \in \mathbb{C}[x]^{n\times n}$ has property L if all the
eigenvalues of $A(x_0,x_1)$ are linear functions. That is, $\lambda_i(x_0,x_1) = \alpha_ix_0 +
\beta_ix_1$ is an eigenvalue of $A(x_0,x_1)$ of multiplicity $n_i$ for $i = 1,\ldots,m$, where

$n = \sum_{i=1}^{m} n_i, \quad (\alpha_i,\beta_i) \ne (\alpha_j,\beta_j) \text{ for } 1 \le i < j \le m.$

The proofs of the following propositions are left to the reader. (See Problems 1-2.)

Proposition 3.13.2 Let $A(x) = A_0 + xA_1$ be a pencil in $\mathbb{C}[x]^{n\times n}$.
TFAE:

(a) $A(x)$ has property L.

(b) The eigenvalues of $A(x)$ are polynomials of degree 1 at most.

(c) The characteristic polynomial of $A(x)$ splits to linear factors over $\mathbb{C}[x]$.

(d) There is an ordering of the eigenvalues of $A_0$ and $A_1$, $a_1,\ldots,a_n$ and
$b_1,\ldots,b_n$, respectively, such that the eigenvalues of $A_0x_0 + A_1x_1$ are
$a_1x_0 + b_1x_1,\ldots,a_nx_0 + b_nx_1$.
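For $2\times 2$ pencils, condition (b) of Proposition 3.13.2 can be tested directly. The sketch below is an illustration, not a method from the text: the function name `property_L_2x2` is hypothetical, and it uses the elementary observation that the eigenvalues $\tfrac12\bigl(\operatorname{tr} A(x) \pm \sqrt{\Delta(x)}\bigr)$ are polynomials exactly when the discriminant $\Delta(x) = (\operatorname{tr} A(x))^2 - 4\det A(x)$, a polynomial of degree at most 2, is a perfect square in $\mathbb{C}[x]$.

```python
# Sketch: property L for a 2x2 pencil A(x) = A0 + x*A1 with exact
# (integer or Fraction) entries. The name property_L_2x2 is illustrative.

def property_L_2x2(A0, A1):
    (a, b), (c, d) = A0
    (e, f), (g, h) = A1
    # tr A(x) = t0 + t1*x
    t0, t1 = a + d, e + h
    # det A(x) = d0 + d1*x + d2*x^2
    d0 = a * d - b * c
    d1 = a * h + e * d - b * g - f * c
    d2 = e * h - f * g
    # Delta(x) = tr^2 - 4*det = qa*x^2 + qb*x + qc
    qa = t1 * t1 - 4 * d2
    qb = 2 * t0 * t1 - 4 * d1
    qc = t0 * t0 - 4 * d0
    # A polynomial of degree <= 2 is a square in C[x] iff qb^2 - 4*qa*qc = 0.
    return qb * qb - 4 * qa * qc == 0

upper_triangular = property_L_2x2([[1, 2], [0, 3]], [[4, 5], [0, 6]])
square_root_pair = property_L_2x2([[0, 1], [0, 0]], [[0, 0], [1, 0]])
```

In the first call both matrices are upper triangular, so the eigenvalues are $1 + 4x$ and $3 + 6x$. In the second call $A(x)$ has eigenvalues $\pm\sqrt{x}$, which are not polynomials.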

Proposition 3.13.3 Let $A(x)$ be a pencil in $\mathbb{C}[x]^{n\times n}$. Then $A(x)$ has
property L if one of the following conditions holds:

(a) $A(x)$ is similar over $\mathbb{C}(x)$ to an upper triangular matrix $U(x) \in \mathbb{C}(x)^{n\times n}$.

(b) $A(x)$ is strictly similar to an upper triangular pencil $U(x) = U_0 + U_1x$,
i.e. $U_0, U_1$ are upper triangular.

(c) $A(x)$ is similar over $\mathbb{C}[x]$ to a diagonal matrix $B(x) \in \mathbb{C}[x]^{n\times n}$.

(d) $A(x)$ is strictly similar to a diagonal pencil.

Note that for pencils with property L any two distinct eigenvalues are 
not tangent at any point of P. For pencils one can significantly improve 
Theorem 3.12.6. 

Theorem 3.13.4 Let $A(x) = A_0 + A_1x \in \mathbb{C}[x]^{n\times n}$ be a nonconstant
pencil ($A_1 \ne 0$). Assume that $A(x)$ is pointwise diagonable on $\mathbb{C}$. Then
$A(x)$ has property L. Furthermore $A(x)$ is similar over $\mathbb{C}[x]$ to a diagonal
pencil $B(x) = B_0 + B_1x$. Suppose furthermore that $A_1$ is diagonable, i.e.
$A(x_0,x_1)$ is pointwise diagonable on $\mathbb{C}^2$. Then $A(x)$ is strictly similar to the
diagonal pencil $B(x)$, i.e. $A_0$ and $A_1$ are commuting diagonable matrices.

Proof. We follow the proof of Theorem 3.12.6. Let $\lambda_1(x),\ldots,\lambda_m(x)$ be
the eigenvalues of $A(x)$ of multiplicities $n_1,\ldots,n_m$ respectively, where each
$\lambda_j(x)$ is viewed as a multivalued function of $x$. More precisely, there exists
an irreducible polynomial

(3.13.1)  $\phi(x,\lambda) = \lambda^p + \sum_{q=1}^{p} \phi_q(x)\lambda^{p-q} \in \mathbb{C}[x,\lambda], \quad \phi(x,\lambda) \mid \det(\lambda I - A(x)),$

such that $\lambda_j(x)$ satisfies the algebraic equation

(3.13.2)  $\phi(x,\lambda) = 0.$

Moreover all branches generated by $\lambda_j(x)$ on $\mathbb{C}$ will generate all the
solutions $\lambda(x)$ of (3.13.2). Equivalently all pairs $(x,\lambda)$ satisfying (3.13.2) form
an affine algebraic variety $V_0 \subset \mathbb{C}^2$. If we compactify $V_0$ to a projective
variety $V \subset \mathbb{P}^2$ then $V$ is a compact Riemann surface. $V \setminus V_0$ consists of a
finite number of points, the points of $V_0$ at infinity. The compactification
of $V_0$ is equivalent to considering $\lambda_j(x)$ as a multivalued function on $\mathbb{P}$.
See for example [GuR65]. Note that any local solution of (3.13.2) is some
eigenvalue $\lambda_i(x)$ of $A(x)$. Since $A(\zeta)$ is diagonable at $\zeta \in \mathbb{C}$, Theorem 3.8.1
implies that the Puiseux expansion of $\lambda_j(x)$ around $\zeta$ in (3.12.3) is of the
form

$\lambda_j(x) = \lambda_j(\zeta) + \sum_{k=s}^{\infty} \lambda_{jk}(\zeta)(x-\zeta)^{k/s}.$

Then

$\frac{d\lambda_j(x)}{dx} = \sum_{k=s}^{\infty} \frac{k}{s}\lambda_{jk}(\zeta)(x-\zeta)^{\frac{k-s}{s}}.$

So $\frac{d\lambda_j(x)}{dx}$ is a multivalued locally bounded function on $\mathbb{C}$. Equivalently,
using the fact that $\lambda_j(x)$ satisfies (3.13.2) we deduce

(3.13.3)  $\frac{d\lambda_j(x)}{dx} = -\frac{\partial\phi(x,\lambda)/\partial x}{\partial\phi(x,\lambda)/\partial\lambda}.$

Hence $\frac{d\lambda_j(x)}{dx}$ is a rational function on $V$, which is analytic on $V_0$ in view
of the assumption that $A(x)$ is pointwise diagonable in $\mathbb{C}$. The Puiseux
expansion of $\lambda_j(x)$ at $\infty$ (3.12.4) is

$\lambda_j(x) = x\sum_{k=0}^{\infty} \lambda_{jk}(\infty)x^{-k/s}.$

Hence

$\frac{d\lambda_j(x)}{dx} = \lambda_{j0}(\infty) + \sum_{k=1}^{\infty} \frac{s-k}{s}\lambda_{jk}(\infty)x^{-k/s}.$

That is, the multivalued function $\frac{d\lambda_j(x)}{dx}$ is bounded in the neighborhood of
$\infty$. Equivalently the rational function in (3.13.3) is bounded at all points
of $V \setminus V_0$. Thus the rational function in (3.13.3) is bounded on the compact
Riemann surface (3.13.2). Hence it must be constant, i.e. $\frac{d\lambda_j(x)}{dx} = b_j \Rightarrow
\lambda_j(x) = a_j + b_jx$. So we have property L by part (b) of Proposition 3.13.2.
In particular two distinct eigenvalues of $A(x)$ are not tangent at any $\zeta \in \mathbb{P}$.
The first part of Theorem 3.12.6 implies that $A(x)$ is similar to $B(x) =
\bigoplus_{j=1}^{m} (a_j + b_jx)I_{n_j}$ over $\mathbb{C}[x]$.

Assume now that $A_1$ is diagonable. Then the second part of Theorem
3.12.6 yields that $A(x)$ is strictly similar to $B(x)$, which is equivalent to
the assumption that $A_0, A_1$ are commuting diagonable matrices (Theorem
3.11.4). □



Theorem 3.13.5 Let $A(x) = A_0 + A_1x \in \mathbb{C}[x]^{n\times n}$. Assume that $A_0$
and $A_1$ are diagonable and $A_0A_1 \ne A_1A_0$. Then exactly one of the following
conditions holds:

(a) $A(x)$ is not diagonable exactly at the points $\zeta_1,\ldots,\zeta_p$, where $1 \le p \le
n(n-1)$.

(b) $A(x)$ is diagonable exactly at the points $\zeta_1 = 0,\ldots,\zeta_q$ for some $q \ge 1$.

Proof. Combine the assumptions of the theorem with Theorem 3.13.4
to deduce the existence of $0 \ne \zeta \in \mathbb{C}$ such that $A(\zeta)$ is not diagonable.
Consider the homogenized pencil $A(x_0,x_1) = A_0x_0 + A_1x_1$. Let

$C(p_1,\ldots,p_k)(x_0,x_1) = \bigoplus_{j=1}^{k} C(p_j) \in \mathbb{C}[x_0,x_1]^{n\times n},$

$\prod_{i=1}^{k} p_i(x_0,x_1,\lambda) = \det(\lambda I - A(x_0,x_1)),$

$p_i(x_0,x_1,\lambda) = \lambda^{m_i} + \sum_{j=1}^{m_i} \lambda^{m_i-j}p_{ij}(x_0,x_1) \in \mathbb{C}[x_0,x_1][\lambda], \quad 1 \le m_i, \quad i = 1,\ldots,k,$

$p_1 \mid p_2 \mid \ldots \mid p_k,$

be the rational canonical form of $A(x_0,x_1)$ over $\mathbb{C}(x_0,x_1)$. (See §2.3.) That is,
each $p_i(x_0,x_1,\lambda)$ is a nontrivial invariant polynomial of $\lambda I - A(x_0,x_1)$.
Hence each $p_i(x_0,x_1,\lambda)$ is a homogeneous polynomial of degree $m_i$ in
$x_0,x_1,\lambda$. Furthermore $A(x_0,x_1) = S(x_0,x_1)C(p_1,\ldots,p_k)(x_0,x_1)S(x_0,x_1)^{-1}$
for some
$S(x_0,x_1) \in \mathbb{C}[x_0,x_1]^{n\times n} \cap \mathrm{GL}(n,\mathbb{C}(x_0,x_1))$. Choose $x_0 = \tau \ne 0$ such that
$\det S(\tau,x_1)$ is not identically zero in $x_1$. Then $A(x) = A(1,x)$ is pointwise
similar to
$\frac{1}{\tau}C(p_1,\ldots,p_k)(\tau,\tau x)$ at all points for which $\det S(\tau,\tau x) \ne 0$, i.e. at all but
a finite number of points in $\mathbb{C}$.

Since $\mathbb{C}[x_0,x_1,\lambda]$ is a unique factorization domain ($\mathbb{D}_U$), $p_k(x_0,x_1,\lambda) = \prod_{i=1}^{r} \phi_i(x_0,x_1,\lambda)^{\ell_i}$, where
each $\phi_i$ is a nonconstant irreducible (homogeneous) polynomial and $\phi_i$ is
coprime with $\phi_j$ for $i \ne j$. Assume first that some $\ell_i > 1$. Then $C(p_k)(\tau,t)$
has a multiple eigenvalue for any $t \in \mathbb{C}$, hence it is not diagonable. That is,
the condition (b) of the theorem holds.

Assume now that $\ell_1 = \ldots = \ell_r = 1$. This is equivalent to the assumption
that $p_k(x_0,x_1,\lambda) = 0$ does not have multiple roots for some $(x_0,x_1)$.
We claim that it is possible to choose $\tau \ne 0$ such that $p_k(\tau,x_1,\lambda)$ has
pairwise distinct roots (in $\lambda$) except at the points $\zeta_1,\ldots,\zeta_q$. Consider the
discriminant $D(x_0,x_1)$ of $p_k(x_0,x_1,\lambda) \in \mathbb{C}[x_0,x_1][\lambda]$. See §1.8. Since $p_{ki}$
is a homogeneous polynomial of degree $i$ for $i = 1,\ldots,m_k$ it follows that
$D(x_0,x_1)$ is a homogeneous polynomial of degree $m_k(m_k-1) \le n(n-1)$.
Since $p_k(x_0,x_1,\lambda) = 0$ does not have multiple roots for some $(x_0,x_1)$ it
follows that $D(x_0,x_1)$ is not a zero polynomial, and $p_k(x_0,x_1,\lambda) = 0$ has a
multiple root if and only if $D(x_0,x_1) = 0$. Choose $\tau \ne 0$ such that $D(\tau,x_1)$
is not a zero polynomial. Let $\zeta_1,\ldots,\zeta_q$ be the distinct roots of $D(\tau,\tau x) = 0$.
Since the degree of $D(x_0,x_1)$ is at most $n(n-1)$ it follows that the degree
of $D(\tau,x)$ is at most $n(n-1)$. Hence $0 \le q \le n(n-1)$. By the definition
of the invariant polynomials it follows that $p_k(x_0,x_1,A(x_0,x_1)) = 0$.
Hence $p_k(\tau,\tau t,A(\tau,\tau t)) = 0$. Let $t \in X = \mathbb{C}\setminus\{\zeta_1,\ldots,\zeta_q\}$. Since $p_k(\tau,\tau t,\lambda)$
has $m_k$ distinct roots, which are all eigenvalues of $A(\tau,\tau t)$, it follows that
$A(\tau,\tau t) = \tau A(t)$ is a diagonable matrix. □

For $n = 2$ the case (b) in Theorem 3.13.5 does not arise. See Problem
4. We do not know if the case (b) of Theorem 3.13.5 arises. Recall that a
hermitian matrix $A \in \mathbb{C}^{n\times n}$, $\bar A^{T} = A$, is always diagonable.

Definition 3.13.6 A pencil $A(x) = A_0 + A_1x$ is called hermitian if
$A_0, A_1 \in \mathbb{C}^{n\times n}$ are hermitian.

Theorem 3.13.7 Let $A(x) = A_0 + A_1x \in \mathbb{C}[x]^{n\times n}$ be a hermitian
pencil. Assume that $A_0A_1 \ne A_1A_0$. Then there exist $2q$ distinct complex
points $\zeta_1,\bar\zeta_1,\ldots,\zeta_q,\bar\zeta_q \in \mathbb{C}\setminus\mathbb{R}$, $1 \le q \le \frac{n(n-1)}{2}$, such that $A(x)$ is not
diagonable if and only if $x \in \{\zeta_1,\bar\zeta_1,\ldots,\zeta_q,\bar\zeta_q\}$.

Proof. Clearly $A(x)$ is a hermitian matrix for any real $x$. Hence $A(x)$
is diagonable for $x \in \mathbb{R}$. Thus the condition (a) of Theorem 3.13.5 holds.
Assume that $A(\zeta)$ is not diagonable. Then $\zeta \in \mathbb{C}\setminus\mathbb{R}$. Let $A(\zeta) = QJQ^{-1}$,
where $J$ is a Jordan canonical form of $A(\zeta)$. Then

$A(\bar\zeta) = \overline{A(\zeta)}^{T} = (\bar Q^{T})^{-1}\bar J^{T}\bar Q^{T}.$

Hence $A(\bar\zeta)$ is not diagonable. Thus the number of distinct points $\zeta$ for
which $A(\zeta)$ is not diagonable is $2q$, where $1 \le 2q \le n(n-1)$. □

The points $\zeta_1,\ldots,\zeta_q$ are called the resonance states of the hermitian
pencil $A(x)$. They are important in certain chemical models [MoF80].
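A minimal numerical illustration of Theorem 3.13.7 for $n = 2$, under stated assumptions: the pencil $A_0 = \operatorname{diag}(1,-1)$, $A_1 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$ is an example chosen here, not one taken from the text, and the function names are hypothetical. For this pencil $A(x)$ has trace zero, so by the Cayley-Hamilton theorem $A(x)^2 = (1 + x^2)I$, and $A(x)$ fails to be diagonable exactly when it is a nonzero matrix with $A(x)^2 = 0$.

```python
# Sketch: resonance points of the hermitian pencil A(x) = A0 + x*A1 with
# A0 = diag(1, -1), A1 = [[0, 1], [1, 0]] (an example chosen for illustration).

def pencil(x):
    return [[1, x], [x, -1]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

ZERO = [[0, 0], [0, 0]]

def not_diagonable(x):
    """For this traceless 2x2 pencil: A(x) != 0 and A(x)^2 = 0."""
    M = pencil(x)
    return M != ZERO and matmul(M, M) == ZERO

# The candidates are the roots of 1 + x^2 = 0: a conjugate pair off the
# real axis, so q = 1 = n(n-1)/2 for n = 2.
resonances = [x for x in (1j, -1j, 0, 1, 2, 1 + 1j) if not_diagonable(x)]
```

The two exceptional points form a conjugate pair, in agreement with the bound $1 \le q \le n(n-1)/2$ of the theorem.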

Problems 

1. Prove Proposition 3.13.2. 

2. (a) Show that property L is equivalent to the condition (a) of Propo- 
sition 3.13.3. 

(b) Prove the other conditions of Proposition 3.13.3. 

3. Show that a pencil $A(x) = A_0 + A_1x \in \mathbb{C}[x]^{2\times 2}$ has property L if
and only if $A(x)$ is strictly similar to an upper triangular pencil.

4. Let $A(x_0,x_1) = A_0x_0 + A_1x_1 \in \mathbb{C}[x_0,x_1]^{2\times 2}$. Then exactly one of the
following conditions holds.

(a) $A(x_0,x_1)$ is strictly similar to a diagonal pencil. (Property L
holds.)

(b) $A(x_0,x_1)$ is not diagonable except exactly for the points $(x_0,x_1) \ne
(0,0)$ lying on a line $ax_0 + bx_1 = 0$. (Property L holds, $A_0A_1 = A_1A_0$
but $A_i$ is not diagonable for some $i \in \{0,1\}$, $A(x_0,x_1)$ has a double
eigenvalue.)

(c) $A(x_0,x_1)$ is diagonable except exactly for the points $(x_0,x_1) \ne
(0,0)$ lying on a line $ax_0 + bx_1 = 0$. (Property L holds.)

(d) $A(x_0,x_1)$ is diagonable except exactly the points $(x_0,x_1) \ne (0,0)$
which lie on two distinct lines in $\mathbb{C}^2$. (Property L does not hold.)

5. Let

$A_0 = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}, \quad A_1 = \begin{bmatrix} 1 & 1 & 2 \\ 1 & 1 & 2 \\ -1 & -1 & -2 \end{bmatrix}.$


(a) Show that $A_0, A_1$ are nilpotent while $A_0 + A_1$ is nonsingular.

(b) Show that $A(x) = A_0 + A_1x$ does not have property L.

(c) Show that $A(x)$ is diagonable for all $x \ne 0$.

3.14 Strict similarity of pencils and analytic similarity

Let $A(x) = A_0 + A_1x$, $B(x) = B_0 + B_1x \in \mathbb{C}[x]^{n\times n}$. Recall the notion of
strict equivalence $A(x) \overset{s}{\sim} B(x)$ (§2.1) and strict similarity $A(x) \overset{s}{\approx} B(x)$ (§3.11).
Clearly $A(x) \overset{s}{\approx} B(x) \Rightarrow A(x) \overset{s}{\sim} B(x)$. (2.9.3) yields the following.

Proposition 3.14.1 Let $A(x),B(x) \in \mathbb{C}[x]^{n\times n}$ be two strictly similar
pencils. Then the three pencils in (3.6.2) are strictly equivalent.

Using Kronecker's result (Theorem 2.1.7) we can determine if the three
pencils in (3.6.2) are strictly equivalent. We now study the implications of
Proposition 3.14.1.

Lemma 3.14.2 Let $A(x) = A_0 + A_1x$, $B(x) = B_0 + B_1x \in \mathbb{C}[x]^{n\times n}$ be
two pencils such that

(3.14.1)  $I \otimes A(x) - A(x)^T \otimes I \overset{s}{\sim} I \otimes A(x) - B(x)^T \otimes I.$

Then there exist two nonzero $U, V \in \mathbb{C}^{n\times n}$ such that

(3.14.2)  $A(x)U - UB(x) = 0, \quad VA(x) - B(x)V = 0.$

In particular

(3.14.3)  $A_0 \ker V,\ A_1 \ker V \subset \ker V, \quad B_0 \ker U,\ B_1 \ker U \subset \ker U.$

Proof. As $A(x)I - IA(x) = 0$ it follows that the kernels of $I \otimes A(x) -
A(x)^T \otimes I \in \mathbb{C}[x]^{n^2\times n^2}$ and its transpose contain a nonzero vector $\hat I_n \in \mathbb{C}^{n^2}$
which is induced by $I_n$. (See §2.8.) Hence the kernels of $I \otimes A(x) - B(x)^T \otimes I$
and its transpose contain nonzero constant vectors. This is equivalent to (3.14.2).

Assume that (3.14.2) holds. Let $\mathbf{x} \in \ker V$. Multiply the second equality
in (3.14.2) from the right by $\mathbf{x}$ to deduce the first part of (3.14.3). The
second part of (3.14.3) is obtained similarly. □
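The correspondence between kernel vectors of $I \otimes A - B^T \otimes I$ and matrices $U$ with $AU - UB = 0$, used in the proof above, can be sketched as follows. This is an illustration under stated assumptions: `vec` stacks the columns of a matrix, the helper names are hypothetical, and the check is done for constant matrices (for a pencil it applies to the coefficients $A_0, A_1$ separately).

```python
# Sketch: (I (x) A - B^T (x) I) vec(U) = vec(A U - U B), with vec stacking
# the columns of U. Helper names are illustrative only.

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def transpose(X):
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matsub(X, Y):
    return [[a - b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def kron(X, Y):
    """Kronecker product of square matrices: blocks X[i][j] * Y."""
    n, m = len(X), len(Y)
    return [[X[i][j] * Y[k][l] for j in range(n) for l in range(m)]
            for i in range(n) for k in range(m)]

def vec(U):
    n = len(U)
    return [U[i][j] for j in range(n) for i in range(n)]

def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def sylvester_matrix(A, B):
    I = identity(len(A))
    return matsub(kron(I, A), kron(transpose(B), I))

A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -2]]
U = [[2, -1], [7, 3]]
lhs = matvec(sylvester_matrix(A, B), vec(U))
rhs = vec(matsub(matmul(A, U), matmul(U, B)))
# With B = A, the vector induced by the identity matrix lies in the kernel.
kernel_image = matvec(sylvester_matrix(A, A), vec(identity(2)))
```

The last line reproduces the observation that $AI - IA = 0$ gives a nonzero kernel vector induced by $I_n$.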

Definition 3.14.3 $A_0, A_1 \in \mathbb{C}^{n\times n}$ have a common invariant subspace
if there exists a subspace $\mathbf{U} \subset \mathbb{C}^n$, $1 \le \dim \mathbf{U} \le n-1$, such that $A_0\mathbf{U}, A_1\mathbf{U} \subset
\mathbf{U}$.

The following claims are left to the reader (see Problems 1-2).

Proposition 3.14.4 Let $A(x) = A_0 + xA_1 \in \mathbb{C}[x]^{n\times n}$. Then $A(x)$ is
strictly similar to an upper triangular pencil

(3.14.4)  $B(x) = \begin{bmatrix} B_{11}(x) & B_{12}(x) \\ 0 & B_{22}(x) \end{bmatrix}, \quad B_{11}(x) \in \mathbb{C}[x]^{n_1\times n_1},\ B_{12}(x) \in \mathbb{C}[x]^{n_1\times n_2},\ B_{22}(x) \in \mathbb{C}[x]^{n_2\times n_2}, \quad 1 \le n_1,n_2,\ n_1 + n_2 = n,$

if and only if $A_0, A_1$ have a common invariant subspace.

Proposition 3.14.5 Assume that $A(x) \in \mathbb{C}[x]^{n\times n}$ is similar over $\mathbb{C}(x)$
to an upper triangular matrix $B(x)$ of the form (3.14.4). Then $\det(\lambda I -
A(x)) \in \mathbb{C}[x,\lambda]$ is reducible.

Theorem 3.14.6 Let $A(x) = A_0 + A_1x$, $B(x) = B_0 + B_1x \in \mathbb{C}[x]^{n\times n}$.
Assume that either $\det(\lambda I - A(x))$ or $\det(\lambda I - B(x))$ is irreducible over
$\mathbb{C}[x,\lambda]$. Then $A(x) \overset{s}{\approx} B(x)$ if and only if (3.14.1) holds.

Proof. Assume that (3.14.1) holds. Suppose that $\det(\lambda I - A(x))$ is
irreducible. Propositions 3.14.4-3.14.5 imply that $A_0, A_1$ do not have a
common invariant subspace. Lemma 3.14.2 implies that the matrix $V$ in
(3.14.2) is invertible, i.e. $B(x) = VA(x)V^{-1}$. Similarly if $\det(\lambda I - B(x))$
is irreducible then $B(x) = U^{-1}A(x)U$. □



Definition 3.14.7 Let $\mathcal{I}_n \subset (\mathbb{C}^{n\times n})^2$ be the set of all pairs $(A_0,A_1)$
such that $\det(\lambda I - (A_0 + A_1x))$ is irreducible.

We will show later that $\mathcal{I}_n = (\mathbb{C}^{n\times n})^2 \setminus X_n$ where $X_n$ is a strict subvariety
of $(\mathbb{C}^{n\times n})^2$. That is, for most of the pencils $A(x)$, $(A_0,A_1) \in (\mathbb{C}^{n\times n})^2$,
$\det(\lambda I - A(x))$ is irreducible. Clearly if $(A_0,A_1) \overset{s}{\approx} (B_0,B_1)$ then either
$(A_0,A_1),(B_0,B_1) \in \mathcal{I}_n$ or $(A_0,A_1),(B_0,B_1) \notin \mathcal{I}_n$.

Corollary 3.14.8 Let $(A_0,A_1),(B_0,B_1) \in \mathcal{I}_n$. Then $A(x) = A_0 + A_1x$
is strictly similar to $B(x) = B_0 + B_1x$ if and only if (3.14.1) holds.

We now discuss the connection between the notion of analytic similarity
of matrices over $H_0$ and strict similarity of pencils. Let $A(x),B(x) \in H_0^{n\times n}$
and assume that $\eta(A,A) = 1$. Suppose that $A(x) \overset{r}{\approx} B(x)$. Theorem 3.9.2
claims that $A(x) \overset{a}{\approx} B(x)$ if and only if there exist two matrices $T_0 \in
\mathrm{GL}(n,\mathbb{C})$, $T_1 \in \mathbb{C}^{n\times n}$ such that

$A_0T_0 = T_0B_0, \quad A_1T_0 + A_0T_1 = T_0B_1 + T_1B_0.$

Let

(3.14.5)  $F(A_0,A_1) = \begin{bmatrix} A_0 & A_1 \\ 0 & A_0 \end{bmatrix}.$

Then (3.9.2) is equivalent in this case to

(3.14.6)  $F(A_0,A_1)F(T_0,T_1) = F(T_0,T_1)F(B_0,B_1).$

As $\det F(T_0,T_1) = (\det T_0)^2$ it follows that $T_0$ is invertible if and only if
$F(T_0,T_1)$ is invertible.
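The block matrices $F(\cdot,\cdot)$ multiply like polynomials in a nilpotent variable, which is why (3.14.6) splits into the two matrix equations above. The following Python sketch checks this product rule and the determinant identity on sample data. The sample matrices and the helper names are assumptions made for illustration.

```python
# Sketch: F(A0, A1) = [[A0, A1], [0, A0]] and its product rule
# F(A0, A1) F(T0, T1) = F(A0 T0, A0 T1 + A1 T0). Names are illustrative.

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matadd(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def F(A0, A1):
    n = len(A0)
    top = [A0[i] + A1[i] for i in range(n)]          # block row [A0 A1]
    bottom = [[0] * n + A0[i] for i in range(n)]     # block row [0  A0]
    return top + bottom

def det(M):
    """Laplace expansion along the first row (fine for small matrices)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, a in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * a * det(minor)
    return total

A0 = [[1, 2], [3, 4]]
A1 = [[0, 1], [1, 0]]
T0 = [[2, 1], [1, 2]]
T1 = [[5, -1], [0, 3]]

product = matmul(F(A0, A1), F(T0, T1))
expected = F(matmul(A0, T0), matadd(matmul(A0, T1), matmul(A1, T0)))
```

Comparing the blocks of both sides of (3.14.6) with this rule gives exactly $A_0T_0 = T_0B_0$ and $A_1T_0 + A_0T_1 = T_0B_1 + T_1B_0$.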

Definition 3.14.9 Let $A_0,A_1,B_0,B_1 \in \mathbb{C}^{n\times n}$. Then $F(A_0,A_1)$ and
$F(B_0,B_1)$ are called strongly similar ($F(A_0,A_1) \cong F(B_0,B_1)$) if there
exists $F(T_0,T_1) \in \mathrm{GL}(2n,\mathbb{C})$ such that (3.14.6) holds.

Clearly $F(A_0,A_1) \cong F(B_0,B_1) \Rightarrow F(A_0,A_1) \approx F(B_0,B_1)$. It can be
shown that the notion of strong similarity is stronger than the notion of
similarity. (Problem 10.)

Proposition 3.14.10 The matrices $F(A_0,A_1)$ and $F(B_0,B_1)$ are strongly
similar if and only if the pencils

$A(x) = F(0,I) + F(A_0,A_1)x, \quad B(x) = F(0,I) + F(B_0,B_1)x$

are strictly similar.

Proof. Let $P = [P_{ij}]_1^2 \in \mathbb{C}^{2n\times 2n}$. Then $F(0,I)P = PF(0,I)$ if and only if
$P_{11} = P_{22}$, $P_{21} = 0$. That is, $P = F(P_{11},P_{12})$ and the proposition follows. □

Clearly $F(A_0,A_1) \cong F(B_0,B_1) \Rightarrow A_0 \approx B_0$. Without loss of generality
we may assume that $A_0 = B_0$. (See Problem 5.) Consider all matrices
$T_0,T_1$ satisfying (3.14.6). For $A_0 = B_0$ (3.14.6) reduces to

$A_0T_0 = T_0A_0, \quad A_0T_1 - T_1A_0 = T_0B_1 - A_1T_0.$

Theorem 2.10.1 yields that the set of matrices $T_0$ which satisfy the above
conditions is of the form

(3.14.7)  $\mathcal{P}(A_1,B_1) = \{T_0 \in C(A_0) : \operatorname{tr}(V(T_0B_1 - A_1T_0)) = 0 \text{ for all } V \in C(A_0)\}.$

Hence

Proposition 3.14.11 Suppose that $F(A_0,A_1) \cong F(A_0,B_1)$. Then

(3.14.8)  $\dim \mathcal{P}(A_1,A_1) = \dim \mathcal{P}(A_1,B_1) = \dim \mathcal{P}(B_1,B_1).$

As in Theorem 2.9.5 for fixed $A_0,A_1$ there exists a neighborhood
$D(A_1,\rho)$ such that the first two equalities in (3.14.8) imply that $F(A_0,A_1) \cong
F(A_0,B_1)$ for all $B_1 \in D(A_1,\rho)$ (Problem 4).

We now consider a splitting result analogous to Theorem 3.5.7.



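Theorem 3.14.12 Assume that

(3.14.9)  $A_0 = \begin{bmatrix} A_{11}^{(0)} & 0 \\ 0 & A_{22}^{(0)} \end{bmatrix}, \quad A_{ii}^{(0)} \in \mathbb{C}^{n_i\times n_i}, \quad i = 1,2,$

where $A_{11}^{(0)}$ and $A_{22}^{(0)}$ do not have a common eigenvalue. Let

$A_1 = \begin{bmatrix} A_{11}^{(1)} & A_{12}^{(1)} \\ A_{21}^{(1)} & A_{22}^{(1)} \end{bmatrix}, \quad B_1 = \begin{bmatrix} B_{11}^{(1)} & B_{12}^{(1)} \\ B_{21}^{(1)} & B_{22}^{(1)} \end{bmatrix}$

be the block partition of $A_1, B_1$ as the block partition of $A_0$. Then

(3.14.10)  $\mathcal{P}(A_1,B_1) = \mathcal{P}(A_{11}^{(1)},B_{11}^{(1)}) \oplus \mathcal{P}(A_{22}^{(1)},B_{22}^{(1)}).$

Moreover

$F(A_0,A_1) \cong F(A_0,B_1) \iff F(A_{ii}^{(0)},A_{ii}^{(1)}) \cong F(A_{ii}^{(0)},B_{ii}^{(1)})$ for $i = 1,2$.

Proof. According to Problem 4, $C(A_0) = C(A_{11}^{(0)}) \oplus C(A_{22}^{(0)})$. Then the
trace condition in (3.14.7) reduces to

$\operatorname{tr}\bigl(V_1(T_1^{(0)}B_{11}^{(1)} - A_{11}^{(1)}T_1^{(0)}) + V_2(T_2^{(0)}B_{22}^{(1)} - A_{22}^{(1)}T_2^{(0)})\bigr) = 0,$

where

$V = V_1 \oplus V_2, \quad T_0 = T_1^{(0)} \oplus T_2^{(0)} \in C(A_{11}^{(0)}) \oplus C(A_{22}^{(0)}).$

Choosing either $V_1 = 0$ or $V_2 = 0$ we obtain (3.14.10). The right
implication of the last claim of the theorem is straightforward. As $\det T_0 =
\det T_1^{(0)} \det T_2^{(0)}$ it follows that $T_0 \in \mathrm{GL}(n,\mathbb{C}) \Rightarrow T_i^{(0)} \in \mathrm{GL}(n_i,\mathbb{C})$, $i =
1,2$. This establishes the left implication of the last claim of the theorem. □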

Thus, the classification of strong similarity classes for matrices $F(A_0,A_1)$
reduces to the case where $A_0$ is nilpotent (Problem 6). In the case $A_0 = 0$,
$F(0,A_1) \cong F(0,B_1) \iff A_1 \approx B_1$. In the case $A_0 = H_n$ the strong
similarity classes of $F(H_n,A_1)$ are classified completely (Problem 9). This
case corresponds to the case discussed in Theorem 3.5.5. The case $A_0 =
H_m \oplus H_m$ can be classified completely using the results of Problem 2
(Problem 3.14.11).

Problems 

1. Prove Proposition 3.14.4. 

2. Prove Proposition 3.14.5. 

3. Let $A(x) \in \mathbb{C}[x]^{n\times n}$ and assume that $A(x)\mathbf{U} \subset \mathbf{U}$, where $\mathbf{U} \subset \mathbb{C}^n$ is a
nontrivial invariant subspace of $A(x)$, i.e. $1 \le \dim \mathbf{U} \le n-1$. Let
$p(x,\lambda) \in \mathbb{C}[x,\lambda]$ be the minimal polynomial of $A(x)|\mathbf{U}$. Thus $1 \le
\deg_\lambda p(x,\lambda) \le n-1$. Show that $p(x,\lambda) \mid \det(\lambda I - A(x))$. Hence
$\det(\lambda I - A(x))$ is reducible over $\mathbb{C}[x,\lambda]$.

4. Modify the proof of Theorem 2.9.5 to show that for fixed $A_0,A_1 \in
\mathbb{C}^{n\times n}$ there exists $\rho > 0$ such that the first two equalities in (3.14.8)
for $B_1 \in D(A_1,\rho)$ imply that $F(A_0,A_1) \cong F(A_0,B_1)$.

5. Show that for any $P \in \mathrm{GL}(n,\mathbb{C})$

$F(A_0,A_1) \cong F(B_0,B_1) \iff F(A_0,A_1) \cong F(PB_0P^{-1},PB_1P^{-1}).$

Assume that $F(A_0,A_1) \cong F(B_0,B_1)$. Show that there exists $P \in
\mathrm{GL}(n,\mathbb{C})$ such that $A_0 = PB_0P^{-1}$.

6. Show that for any $\lambda \in \mathbb{C}$

$F(A_0,A_1) \cong F(B_0,B_1) \iff F(A_0 - \lambda I,A_1) \cong F(B_0 - \lambda I,B_1).$

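7. Let $A_i \in \mathbb{C}^{n\times n}$, $i = 0,\ldots,s-1$. Define

$F(A_0,\ldots,A_{s-1}) = \begin{bmatrix} A_0 & A_1 & A_2 & \cdots & A_{s-1} \\ 0 & A_0 & A_1 & \cdots & A_{s-2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & A_0 \end{bmatrix}.$

$F(A_0,\ldots,A_{s-1})$ and $F(B_0,\ldots,B_{s-1})$ are called strongly similar
($F(A_0,\ldots,A_{s-1}) \cong F(B_0,\ldots,B_{s-1})$) if

$F(A_0,\ldots,A_{s-1}) = F(T_0,\ldots,T_{s-1})F(B_0,\ldots,B_{s-1})F(T_0,\ldots,T_{s-1})^{-1}, \quad F(T_0,\ldots,T_{s-1}) \in \mathrm{GL}(sn,\mathbb{C}).$

Show that $F(A_0,\ldots,A_{s-1}) \cong F(B_0,\ldots,B_{s-1})$ if and only if the
equalities (3.9.2) hold for $k = 0,\ldots,s-1$ and $T_0 \in \mathrm{GL}(n,\mathbb{C})$.

8. Let

$Z = H_n \oplus \ldots \oplus H_n, \quad X = [X_{pq}]_1^s,\ Y = [Y_{pq}]_1^s \in \mathbb{C}^{sn\times sn},$

$X_{pq} = [x_{ij}^{(pq)}]_1^n,\ Y_{pq} = [y_{ij}^{(pq)}]_1^n \in \mathbb{C}^{n\times n}, \quad p,q = 1,\ldots,s.$

Show that if each $X_{pq}$ is an upper triangular matrix then

$\det X = \prod_{r=1}^{n} \det [x_{rr}^{(pq)}]_{p,q=1}^{s}.$

(Expand the determinant of $X$ by the rows $n, 2n, \ldots, sn$ and use
induction.) Define

$A_r = [a_{pq}^{(r)}]_1^s,\ B_r = [b_{pq}^{(r)}]_1^s \in \mathbb{C}^{s\times s},$

$a_{pq}^{(r)} = \sum_{i=1}^{r+1} x_{(n-r+i-1)i}^{(pq)}, \quad b_{pq}^{(r)} = \sum_{i=1}^{r+1} y_{(n-r+i-1)i}^{(pq)}, \quad r = 0,\ldots,n-1.$

Using Theorem 2.8.3 show

$F(Z,X) \cong F(Z,Y) \iff F(A_0,\ldots,A_{n-1}) \cong F(B_0,\ldots,B_{n-1}).$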

9. Let $X = [x_{ij}]_1^n$, $Y = [y_{ij}]_1^n \in \mathbb{C}^{n\times n}$. Using Problems 7-8 show that
$F(H_n,X) \cong F(H_n,Y)$ if and only if

$\sum_{i=1}^{r} x_{(n-r+i)i} = \sum_{i=1}^{r} y_{(n-r+i)i}$ for $r = 1,\ldots,n$.

10. Let $X = [x_{ij}]_1^2 \in \mathbb{C}^{2\times 2}$. Show that if $x_{21} \ne 0$ then $F(H_2,X) \approx H_4$.
Combine this result with Problem 9 to show the existence of $Y \in
\mathbb{C}^{2\times 2}$ such that $F(H_2,X)$ is similar to $F(H_2,Y)$ but $F(H_2,X)$ is not
strongly similar to $F(H_2,Y)$.

11. Assume in Problem 8 that $s = 2$. Let

(3.14.11)  $A(x) = \sum_{i=0}^{n-1} A_ix^i, \quad B(x) = \sum_{i=0}^{n-1} B_ix^i \in H_0^{2\times 2}.$

Use the results of Problems 7-8, (3.9.2) and Problem 2 to show that
$F(Z,X) \cong F(Z,Y)$ if and only if the three matrices in (3.6.2) have
the same local invariant polynomials up to degree $n-1$.

3.15 Historical remarks

The exposition of §3.1 is close to [Gan59]. The results of §3.2 were inspired
by [Rot81]. The notion of local indices (Problem 3.2.3) can be found in
[FrS80]. The contents of §3.3 are standard. Theorem 3.3.2 can be found
in [Wie67] and Problem 3.3.1 in [Gan59]. The use of the Cauchy integration
formula to study the properties of analytic functions of operators and
matrices as in §3.4 is now common, e.g. [Kat80] and [Kat82]. Theorem 3.4.6
is standard. Theorem 3.4.9 is a part of the Kreiss matrix stability theorem
[Kre62]. The inequality (3.4.16) is due to [Tad81]. The results of Problem
3.4.7 are from [Fri81]. The results of §3.5 were influenced by Arnold [Arn71]; in
particular Theorem 3.5.3 is from [Arn71]. See also [Was77]. The subject
of §3.6 and its applications in the theory of differential equations in the
neighborhood of singularities was emphasized in the works of Wasow [Was63], [Was77]
and [Was78]. Theorem 3.6.4 for one complex variable appears in [Fri80b].
Corollary 3.6.5 is due to [Was63]. Theorem 3.7.1 for simply connected
domains is due to [Gin78]. See [Was78] for the extension of Theorem 3.7.1 to
certain domains $\Omega \subset \mathbb{C}^p$. It is shown there that Theorem 3.7.1 fails even
for simply connected domains in $\mathbb{C}^3$.

Theorem 3.8.1 can be found in [Kat80] or [Fri78]. The results of §3.9-§3.10
were taken from [Fri80b]. It is worthwhile to mention that the conjecture
stated in [Fri80b], that $A(x)$ and $B(x)$ are analytically similar over $H_0$
if the three matrices in (3.6.2) are equivalent over $H_0$, is false [Gur81, §6].
The contents of §3.11 are known to the experts. The nontrivial part of this
section (Theorem 3.11.6) is due to [DDG51]. Some of the results in §3.12,
in particular Theorem 3.12.4, seem to be new. Property L of §3.13 was
introduced by Motzkin-Taussky [MoT52] and [MoT55]. Theorem 3.13.4 is
a slight improvement of [MoT55]. Our proof of property L in Theorem
3.13.4 follows [Kat80]. Theorem 3.13.7 is taken from [MoF80]. Theorem
3.13.7 associates the "defective" points $\zeta_1, \dots, \zeta_q$ with the resonance states
of molecules. Many results in §3.14 are taken from [Fri80a] and [Fri80b].
That section connects the analytic similarity of matrices with simultaneous
similarity of certain pairs of matrices. Simultaneous similarity of matrices
is discussed in [Fri83].



Chapter 4 

Inner product spaces 



4.1 Inner product 

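Definition 4.1.1 Let $\mathbb{F} = \mathbb{R}, \mathbb{C}$ and let $V$ be a vector space over $\mathbb{F}$.
Then $\langle \cdot, \cdot \rangle : V \times V \to \mathbb{F}$ is called an inner product if the following
conditions hold:

(a) $\langle a\mathbf{x} + b\mathbf{y}, \mathbf{z} \rangle = a\langle \mathbf{x}, \mathbf{z} \rangle + b\langle \mathbf{y}, \mathbf{z} \rangle$, for all $a, b \in \mathbb{F}$, $\mathbf{x}, \mathbf{y}, \mathbf{z} \in V$;
(br) for $\mathbb{F} = \mathbb{R}$: $\langle \mathbf{y}, \mathbf{x} \rangle = \langle \mathbf{x}, \mathbf{y} \rangle$, for all $\mathbf{x}, \mathbf{y} \in V$;
(bc) for $\mathbb{F} = \mathbb{C}$: $\langle \mathbf{y}, \mathbf{x} \rangle = \overline{\langle \mathbf{x}, \mathbf{y} \rangle}$, for all $\mathbf{x}, \mathbf{y} \in V$;
(c) $\langle \mathbf{x}, \mathbf{x} \rangle > 0$ for all $\mathbf{x} \in V \setminus \{\mathbf{0}\}$.

Other standard properties of inner products are mentioned in Problems
1-2. We will use the abbreviation IPS for inner product space. In this
chapter we assume that $\mathbb{F} = \mathbb{R}, \mathbb{C}$ unless stated otherwise.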

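Proposition 4.1.2 Let $V$ be a vector space over $\mathbb{R}$. Identify $V_c$ with
the set of pairs $(\mathbf{x}, \mathbf{y})$, $\mathbf{x}, \mathbf{y} \in V$. Then $V_c$ is a vector space over $\mathbb{C}$ with

$(a + \sqrt{-1}\, b)(\mathbf{x}, \mathbf{y}) := a(\mathbf{x}, \mathbf{y}) + b(-\mathbf{y}, \mathbf{x})$, for all $a, b \in \mathbb{R}$, $\mathbf{x}, \mathbf{y} \in V$.

If $V$ has a basis $\mathbf{e}_1, \dots, \mathbf{e}_n$ over $\mathbb{R}$ then $(\mathbf{e}_1, \mathbf{0}), \dots, (\mathbf{e}_n, \mathbf{0})$ is a basis of $V_c$
over $\mathbb{C}$. Any inner product $\langle \cdot, \cdot \rangle$ on $V$ over $\mathbb{R}$ induces the following inner
product on $V_c$:

$\langle (\mathbf{x}, \mathbf{y}), (\mathbf{u}, \mathbf{v}) \rangle = \langle \mathbf{x}, \mathbf{u} \rangle + \langle \mathbf{y}, \mathbf{v} \rangle + \sqrt{-1}\,(\langle \mathbf{y}, \mathbf{u} \rangle - \langle \mathbf{x}, \mathbf{v} \rangle)$, $\mathbf{x}, \mathbf{y}, \mathbf{u}, \mathbf{v} \in V$.

$\|\mathbf{x}\| := \sqrt{\langle \mathbf{x}, \mathbf{x} \rangle}$ is called the norm (length) of $\mathbf{x} \in V$.

We leave the proof of this proposition to the reader (Problem 3).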

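Definition 4.1.3 Let $V$ be an IPS. Then

(a) $\mathbf{x}, \mathbf{y} \in V$ are called orthogonal if $\langle \mathbf{x}, \mathbf{y} \rangle = 0$.

(b) $S, T \subset V$ are called orthogonal if $\langle \mathbf{x}, \mathbf{y} \rangle = 0$ for any $\mathbf{x} \in S$, $\mathbf{y} \in T$.

(c) For any $S \subset V$, $S^\perp \subset V$ is the maximal set orthogonal to $S$.

(d) $\mathbf{x}_1, \dots, \mathbf{x}_m$ is called an orthonormal set if

$\langle \mathbf{x}_i, \mathbf{x}_j \rangle = \delta_{ij}$, $i, j = 1, \dots, m$.

(e) $\mathbf{x}_1, \dots, \mathbf{x}_n$ is called an orthonormal basis if it is an orthonormal set which
is a basis in $V$.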

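Definition 4.1.4 (Gram-Schmidt algorithm.) Let $V$ be an IPS and
$S = \{\mathbf{x}_1, \dots, \mathbf{x}_m\} \subset V$ a finite (possibly empty) set ($m \ge 0$). Then
$\tilde{S} = \{\mathbf{e}_1, \dots, \mathbf{e}_p\}$ is the orthonormal set ($p \ge 1$) or the empty set ($p = 0$) obtained
from $S$ using the following recursive steps:

(a) If $\mathbf{x}_1 = \mathbf{0}$ remove it from $S$. Otherwise replace $\mathbf{x}_1$ by $\|\mathbf{x}_1\|^{-1} \mathbf{x}_1$.

(b) Assume that $\mathbf{x}_1, \dots, \mathbf{x}_k$ is an orthonormal set and $1 \le k < m$. Let
$\mathbf{y}_{k+1} = \mathbf{x}_{k+1} - \sum_{i=1}^{k} \langle \mathbf{x}_{k+1}, \mathbf{x}_i \rangle \mathbf{x}_i$. If $\mathbf{y}_{k+1} = \mathbf{0}$ remove $\mathbf{x}_{k+1}$ from $S$.
Otherwise replace $\mathbf{x}_{k+1}$ by $\|\mathbf{y}_{k+1}\|^{-1} \mathbf{y}_{k+1}$.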
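
The recursive steps of the Gram-Schmidt algorithm can be sketched in a few lines of Python. This is only an illustration under stated assumptions: vectors are lists of numbers in $\mathbb{C}^n$ with the standard inner product, the names `inner` and `gram_schmidt` are ours, and the absolute tolerance `tol` used to decide whether a vector is numerically zero is a hypothetical choice, not part of the definition.

```python
import math

def inner(x, y):
    """Standard inner product <x, y> = sum_i x_i * conj(y_i) on C^n."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def gram_schmidt(vectors, tol=1e-12):
    """Return the orthonormal list obtained from `vectors`.

    A vector whose remainder y has norm at most `tol` is treated as
    y = 0 and removed (tol is an assumed numerical threshold).
    """
    basis = []
    for x in vectors:
        # y = x - sum_i <x, e_i> e_i over the orthonormal vectors found so far
        y = list(x)
        for e in basis:
            c = inner(x, e)
            y = [yi - c * ei for yi, ei in zip(y, e)]
        norm = math.sqrt(inner(y, y).real)
        if norm > tol:
            basis.append([yi / norm for yi in y])
    return basis

# The second vector is a multiple of the first, so it is removed.
S = [[1, 1, 0], [2, 2, 0], [1, 0, 1]]
E = gram_schmidt(S)
```

Here `E` contains two orthonormal vectors spanning the same subspace as `S`; in floating point the orthonormality holds only up to rounding error.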

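Corollary 4.1.5 Let $V$ be an IPS and $S = \{\mathbf{x}_1, \dots, \mathbf{x}_n\} \subset V$ be $n$
linearly independent vectors. Then the Gram-Schmidt algorithm on $S$ is
given as follows:

$\mathbf{y}_1 := \mathbf{x}_1$, $r_{11} := \|\mathbf{y}_1\|$, $\mathbf{e}_1 := \frac{1}{r_{11}} \mathbf{y}_1$,

(4.1.1) $r_{ji} := \langle \mathbf{x}_i, \mathbf{e}_j \rangle$, $j = 1, \dots, i-1$,

$\mathbf{y}_i := \mathbf{x}_i - \sum_{j=1}^{i-1} r_{ji} \mathbf{e}_j$, $r_{ii} := \|\mathbf{y}_i\|$, $\mathbf{e}_i := \frac{1}{r_{ii}} \mathbf{y}_i$, $i = 2, \dots, n$.

In particular, $\mathbf{e}_i \in S_i$ and $\|\mathbf{y}_i\| = \mathrm{dist}(\mathbf{x}_i, S_{i-1})$, where $S_i = \mathrm{span}(\mathbf{x}_1, \dots, \mathbf{x}_i)$
for $i = 1, \dots, n$ and $S_0 = \{\mathbf{0}\}$. (See Problem 4 for the definition of $\mathrm{dist}(\mathbf{x}_i, S_{i-1})$.)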

Corollary 4.1.6 Any (ordered) basis in a finite dimensional IPS V 
induces an orthonormal basis by the Gram-Schmidt algorithm. 

See Problem 4 for some known properties related to the above notions. 

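Problems

1. Let $V$ be an IPS over $\mathbb{F}$. Show

$\langle \mathbf{0}, \mathbf{x} \rangle = \langle \mathbf{x}, \mathbf{0} \rangle = 0$,

for $\mathbb{F} = \mathbb{R}$: $\langle \mathbf{z}, a\mathbf{x} + b\mathbf{y} \rangle = a\langle \mathbf{z}, \mathbf{x} \rangle + b\langle \mathbf{z}, \mathbf{y} \rangle$, for all $a, b \in \mathbb{R}$, $\mathbf{x}, \mathbf{y}, \mathbf{z} \in V$,
for $\mathbb{F} = \mathbb{C}$: $\langle \mathbf{z}, a\mathbf{x} + b\mathbf{y} \rangle = \bar{a}\langle \mathbf{z}, \mathbf{x} \rangle + \bar{b}\langle \mathbf{z}, \mathbf{y} \rangle$, for all $a, b \in \mathbb{C}$, $\mathbf{x}, \mathbf{y}, \mathbf{z} \in V$.

2. Let $V$ be an IPS. Show

(a) $\|a\mathbf{x}\| = |a| \, \|\mathbf{x}\|$ for $a \in \mathbb{F}$ and $\mathbf{x} \in V$.

(b) The Cauchy-Schwarz inequality:

$|\langle \mathbf{x}, \mathbf{y} \rangle| \le \|\mathbf{x}\| \, \|\mathbf{y}\|$,

and equality holds if and only if $\mathbf{x}, \mathbf{y}$ are linearly dependent (collinear).

(c) The triangle inequality

$\|\mathbf{x} + \mathbf{y}\| \le \|\mathbf{x}\| + \|\mathbf{y}\|$,
and equality holds if either $\mathbf{x} = \mathbf{0}$ or $\mathbf{y} = a\mathbf{x}$ for $a \in \mathbb{R}_+$.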

3. Prove Proposition 4.1.2. 

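4. Let $V$ be a finite dimensional IPS of dimension $n$. Assume that
$S \subset V$. Show

(a) If $\mathbf{x}_1, \dots, \mathbf{x}_m$ is an orthonormal set then $\mathbf{x}_1, \dots, \mathbf{x}_m$ are linearly
independent.

(b) Assume that $\mathbf{e}_1, \dots, \mathbf{e}_n$ is an orthonormal basis in $V$. Show that
for any $\mathbf{x} \in V$ the orthonormal expansion holds

(4.1.2) $\mathbf{x} = \sum_{i=1}^{n} \langle \mathbf{x}, \mathbf{e}_i \rangle \mathbf{e}_i$.

Furthermore for any $\mathbf{x}, \mathbf{y} \in V$

(4.1.3) $\langle \mathbf{x}, \mathbf{y} \rangle = \sum_{i=1}^{n} \langle \mathbf{x}, \mathbf{e}_i \rangle \overline{\langle \mathbf{y}, \mathbf{e}_i \rangle}$.

(c) Assume that $S$ is a finite set. Let $\tilde{S}$ be the set obtained by the
Gram-Schmidt process. Show that $\tilde{S} = \emptyset \iff \mathrm{span}\, S = \{\mathbf{0}\}$. Show
that if $\tilde{S} \neq \emptyset$ then $\mathbf{e}_1, \dots, \mathbf{e}_p$ is an orthonormal basis in $\mathrm{span}\, S$.

(d) There exists an orthonormal basis $\mathbf{e}_1, \dots, \mathbf{e}_n$ in $V$ and $0 \le m \le n$
such that

$\mathbf{e}_1, \dots, \mathbf{e}_m \in S$, $\mathrm{span}\, S = \mathrm{span}(\mathbf{e}_1, \dots, \mathbf{e}_m)$,
$S^\perp = \mathrm{span}(\mathbf{e}_{m+1}, \dots, \mathbf{e}_n)$,
$(S^\perp)^\perp = \mathrm{span}\, S$.

(e) Assume from here to the end of the problem that $S$ is a subspace.
Show $V = S \oplus S^\perp$.

(f) Let $\mathbf{x} \in V$ and let $\mathbf{x} = \mathbf{u} + \mathbf{v}$ for unique $\mathbf{u} \in S$, $\mathbf{v} \in S^\perp$. Let
$P(\mathbf{x}) := \mathbf{u}$ be the projection of $\mathbf{x}$ on $S$. Show that $P : V \to V$ is a
linear transformation satisfying

$P^2 = P$, $\mathrm{Range}\, P = S$, $\mathrm{Ker}\, P = S^\perp$.

(g) Show

(4.1.4) $\mathrm{dist}(\mathbf{x}, S) := \|\mathbf{x} - P\mathbf{x}\| \le \|\mathbf{x} - \mathbf{w}\|$ for any $\mathbf{w} \in S$,
and equality holds if and only if $\mathbf{w} = P\mathbf{x}$.

(h) Show that $\mathrm{dist}(\mathbf{x}, S) = \|\mathbf{x} - \mathbf{w}\|$ for some $\mathbf{w} \in S$ if and only if
$\mathbf{x} - \mathbf{w}$ is orthogonal to $S$.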

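5. Let $X \in \mathbb{C}^{m \times n}$ and assume that $m \ge n$ and $\mathrm{rank}\, X = n$. Let
$\mathbf{x}_1, \dots, \mathbf{x}_n \in \mathbb{C}^m$ be the columns of $X$, i.e. $X = (\mathbf{x}_1, \dots, \mathbf{x}_n)$. Assume
that $\mathbb{C}^m$ is an IPS with the standard inner product $\langle \mathbf{x}, \mathbf{y} \rangle = \mathbf{y}^* \mathbf{x}$.
Perform the Gram-Schmidt algorithm of Corollary 4.1.5 to obtain the matrix
$Q = (\mathbf{e}_1, \dots, \mathbf{e}_n) \in \mathbb{C}^{m \times n}$. Let $R = (r_{ji})_1^n \in \mathbb{C}^{n \times n}$ be the upper
triangular matrix with $r_{ji}$, $j \le i$, given by (4.1.1). Show that $Q^* Q = I_n$
and $X = QR$. (This is the QR factorization.) Show that if in addition
$X \in \mathbb{R}^{m \times n}$ then $Q$ and $R$ are real valued matrices.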

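6. Let $C \in \mathbb{C}^{n \times n}$ and assume that $\{\lambda_1, \dots, \lambda_n\}$ are the $n$ eigenvalues of $C$
counted with their multiplicities. View $C$ as an operator $C : \mathbb{C}^n \to \mathbb{C}^n$.
View $\mathbb{C}^n$ as the $2n$-dimensional vector space $\mathbb{R}^{2n}$ over $\mathbb{R}$. Let
$C = A + \sqrt{-1}\, B$, $A, B \in \mathbb{R}^{n \times n}$.

a. Show that $\hat{C} := \begin{pmatrix} A & -B \\ B & A \end{pmatrix} \in \mathbb{R}^{(2n) \times (2n)}$ represents the operator
$C : \mathbb{C}^n \to \mathbb{C}^n$ as an operator over $\mathbb{R}$ in a suitably chosen basis.

b. Show that $\{\lambda_1, \bar{\lambda}_1, \dots, \lambda_n, \bar{\lambda}_n\}$ are the $2n$ eigenvalues of $\hat{C}$ counted
with multiplicities.

c. Show that the Jordan canonical form of $\hat{C}$ is obtained by replacing
each Jordan block $\lambda I + H$ in $C$ by two Jordan blocks $\lambda I + H$ and
$\bar{\lambda} I + H$.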






4.2 Special transformations in IPS 

Proposition 4.2.1 Let $V$ be an IPS and $T : V \to V$ a linear transformation.
Then there exists a unique linear transformation $T^* : V \to V$ such
that $\langle T\mathbf{x}, \mathbf{y} \rangle = \langle \mathbf{x}, T^*\mathbf{y} \rangle$ for all $\mathbf{x}, \mathbf{y} \in V$.

See Problems 1-2. 

Definition 4.2.2 Let $V$ be an IPS and let $T : V \to V$ be a linear
transformation. Then

(a) $T$ is called self-adjoint if $T^* = T$;

(b) $T$ is called anti self-adjoint if $T^* = -T$;

(c) $T$ is called unitary if $T^*T = TT^* = I$;

(d) $T$ is called normal if $T^*T = TT^*$.

Denote by $\mathrm{S}(V)$, $\mathrm{AS}(V)$, $\mathrm{U}(V)$, $\mathrm{N}(V)$ the sets of self-adjoint, anti
self-adjoint, unitary and normal operators on $V$ respectively.

Proposition 4.2.3 Let $V$ be an IPS over $\mathbb{F} = \mathbb{R}, \mathbb{C}$ with an orthonormal
basis $E = \{\mathbf{e}_1, \dots, \mathbf{e}_n\}$. Let $T : V \to V$ be a linear transformation. Let
$A = (a_{ij}) \in \mathbb{F}^{n \times n}$ be the representation matrix of $T$ in the basis $E$:

(4.2.1) $a_{ij} = \langle T\mathbf{e}_j, \mathbf{e}_i \rangle$, $i, j = 1, \dots, n$.

Then for $\mathbb{F} = \mathbb{R}$:

(a) $T^*$ is represented by $A^T$,

(b) $T$ is self-adjoint $\iff$ $A = A^T$,

(c) $T$ is anti self-adjoint $\iff$ $A = -A^T$,

(d) $T$ is unitary $\iff$ $A$ is orthogonal $\iff$ $AA^T = A^TA = I$,

(e) $T$ is normal $\iff$ $A$ is normal $\iff$ $AA^T = A^TA$,

and for $\mathbb{F} = \mathbb{C}$:

(a) $T^*$ is represented by $A^*$ ($:= \bar{A}^T$),

(b) $T$ is self-adjoint $\iff$ $A$ is hermitian $\iff$ $A = A^*$,

(c) $T$ is anti self-adjoint $\iff$ $A$ is anti hermitian $\iff$ $A = -A^*$,

(d) $T$ is unitary $\iff$ $A$ is unitary $\iff$ $AA^* = A^*A = I$,

(e) $T$ is normal $\iff$ $A$ is normal $\iff$ $AA^* = A^*A$.

See Problem 3.
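
As a small numerical illustration of part (a) for $\mathbb{F} = \mathbb{C}$, the following sketch checks the defining identity $\langle A\mathbf{x}, \mathbf{y} \rangle = \langle \mathbf{x}, A^*\mathbf{y} \rangle$ for one matrix and one pair of vectors in $\mathbb{C}^2$ with the standard inner product. The helper names `inner`, `matvec` and `adjoint`, and the particular numbers, are our own choices for the example.

```python
def inner(x, y):
    """Standard inner product <x, y> = sum_i x_i * conj(y_i)."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def matvec(A, x):
    """Matrix-vector product A x for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def adjoint(A):
    """Conjugate transpose A* of a matrix stored as a list of rows."""
    rows, cols = len(A), len(A[0])
    return [[complex(A[i][j]).conjugate() for i in range(rows)]
            for j in range(cols)]

A = [[1 + 2j, 3], [0, 4 - 1j]]
x = [1j, 2]
y = [1, 1 - 1j]

lhs = inner(matvec(A, x), y)           # <Ax, y>
rhs = inner(x, matvec(adjoint(A), y))  # <x, A*y>
```

The two values `lhs` and `rhs` agree, as the identity requires. One example of course proves nothing; it only makes the convention (linear in the first argument, conjugate linear in the second) concrete.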






Proposition 4.2.4 Let $V$ be an IPS over $\mathbb{R}$, and let $T \in \mathrm{Hom}(V)$.
Let $V_c$ be the complexification of $V$. Then there exists a unique
$T_c \in \mathrm{Hom}(V_c)$ such that $T_c|V = T$. Furthermore $T$ is self-adjoint, unitary or
normal if and only if $T_c$ is self-adjoint, unitary or normal respectively.

See Problem 4.

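Definition 4.2.5 For a domain $\mathbb{D}$ with identity 1 let

$\mathrm{S}(n, \mathbb{D}) := \{A \in \mathbb{D}^{n \times n} : A = A^T\}$,
$\mathrm{AS}(n, \mathbb{D}) := \{A \in \mathbb{D}^{n \times n} : A = -A^T\}$,
$\mathrm{O}(n, \mathbb{D}) := \{A \in \mathbb{D}^{n \times n} : AA^T = A^TA = I\}$,
$\mathrm{SO}(n, \mathbb{D}) := \{A \in \mathrm{O}(n, \mathbb{D}) : \det A = 1\}$,
$\mathrm{DO}(n, \mathbb{D}) := \mathrm{D}(n, \mathbb{D}) \cap \mathrm{O}(n, \mathbb{D})$,
$\mathrm{N}(n, \mathbb{R}) := \{A \in \mathbb{R}^{n \times n} : AA^T = A^TA\}$,
$\mathrm{N}(n, \mathbb{C}) := \{A \in \mathbb{C}^{n \times n} : AA^* = A^*A\}$,
$\mathrm{H}_n := \{A \in \mathbb{C}^{n \times n} : A = A^*\}$,
$\mathrm{AH}_n := \{A \in \mathbb{C}^{n \times n} : A = -A^*\}$,
$\mathrm{U}_n := \{A \in \mathbb{C}^{n \times n} : AA^* = A^*A = I\}$,
$\mathrm{SU}_n := \{A \in \mathrm{U}_n : \det A = 1\}$,
$\mathrm{DU}_n := \mathrm{D}(n, \mathbb{C}) \cap \mathrm{U}_n$.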

See Problem 5 for relations between these classes. 

Theorem 4.2.6 Let $V$ be an IPS over $\mathbb{C}$ of dimension $n$. Then a linear
transformation $T : V \to V$ is normal if and only if $V$ has an orthonormal
basis consisting of eigenvectors of $T$.

Proof. Suppose first that $V$ has an orthonormal basis $\mathbf{e}_1, \dots, \mathbf{e}_n$ such
that $T\mathbf{e}_i = \lambda_i \mathbf{e}_i$, $i = 1, \dots, n$. From the definition of $T^*$ it follows that
$T^*\mathbf{e}_i = \bar{\lambda}_i \mathbf{e}_i$, $i = 1, \dots, n$. Hence $TT^* = T^*T$.

Assume now $T$ is normal. Since $\mathbb{C}$ is algebraically closed $T$ has an
eigenvalue $\lambda_1$. Let $V_1$ be the subspace of $V$ spanned by all eigenvectors
of $T$ corresponding to the eigenvalue $\lambda_1$. Clearly $TV_1 \subset V_1$. Let $\mathbf{x} \in V_1$.
Then $T\mathbf{x} = \lambda_1 \mathbf{x}$. Thus

$T(T^*\mathbf{x}) = (TT^*)\mathbf{x} = (T^*T)\mathbf{x} = T^*(T\mathbf{x}) = \lambda_1 T^*\mathbf{x} \Rightarrow T^*V_1 \subset V_1$.

Hence $TV_1^\perp, T^*V_1^\perp \subset V_1^\perp$. Since $V = V_1 \oplus V_1^\perp$ it is enough to prove the
theorem for $T|V_1$ and $T|V_1^\perp$.

As $T|V_1 = \lambda_1 I_{V_1}$, it is straightforward to show $T^*|V_1 = \bar{\lambda}_1 I_{V_1}$ (see
Problem 2). Hence for $T|V_1$ the theorem trivially holds. For $T|V_1^\perp$ the
theorem follows by induction. □

The proof of Theorem 4.2.6 yields:

Corollary 4.2.7 Let $V$ be an IPS over $\mathbb{R}$ of dimension $n$. Then the
linear transformation $T : V \to V$ with a real spectrum is normal if and
only if $V$ has an orthonormal basis consisting of eigenvectors of $T$.

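Proposition 4.2.8 Let $V$ be an IPS over $\mathbb{C}$. Let $T \in \mathrm{N}(V)$. Then

$T$ is self-adjoint $\iff$ $\mathrm{spec}(T) \subset \mathbb{R}$,

$T$ is unitary $\iff$ $\mathrm{spec}(T) \subset S^1 = \{z \in \mathbb{C} : |z| = 1\}$.

Proof. Since $T$ is normal there exists an orthonormal basis $\mathbf{e}_1, \dots, \mathbf{e}_n$
such that $T\mathbf{e}_i = \lambda_i \mathbf{e}_i$, $i = 1, \dots, n$. Hence $T^*\mathbf{e}_i = \bar{\lambda}_i \mathbf{e}_i$. Then

$T = T^* \iff \lambda_i = \bar{\lambda}_i$, $i = 1, \dots, n$,

$TT^* = T^*T = I \iff |\lambda_i| = 1$, $i = 1, \dots, n$.

□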

Combine Proposition 4.2.4 and Corollary 4.2.7 with the above proposi- 
tion to deduce: 

Corollary 4.2.9 Let $V$ be an IPS over $\mathbb{R}$ and let $T \in \mathrm{S}(V)$. Then
$\mathrm{spec}(T) \subset \mathbb{R}$ and $V$ has an orthonormal basis consisting of the eigenvectors
of $T$.

Proposition 4.2.10 Let $V$ be an IPS over $\mathbb{R}$ and let $T \in \mathrm{U}(V)$. Then
$V = \oplus_{i \in \{-1, 1, 2, \dots, k\}} V_i$, where $k \ge 1$, $V_i$ and $V_j$ are orthogonal for $i \neq j$,
such that

(a) $T|V_{-1} = -I_{V_{-1}}$, $\dim V_{-1} \ge 0$,

(b) $T|V_1 = I_{V_1}$, $\dim V_1 \ge 0$,

(c) $TV_i = V_i$, $\dim V_i = 2$, $\mathrm{spec}(T|V_i) \subset S^1 \setminus \{-1, 1\}$ for $i = 2, \dots, k$.

See Problem 7.

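Proposition 4.2.11 Let $V$ be an IPS over $\mathbb{R}$ and let $T \in \mathrm{AS}(V)$.
Then $V = \oplus_{i \in \{1, 2, \dots, k\}} V_i$, where $k \ge 1$, $V_i$ and $V_j$ are orthogonal for
$i \neq j$, such that

(a) $T|V_1 = 0_{V_1}$, $\dim V_1 \ge 0$,

(b) $TV_i = V_i$, $\dim V_i = 2$, $\mathrm{spec}(T|V_i) \subset \sqrt{-1}\, \mathbb{R} \setminus \{0\}$ for $i = 2, \dots, k$.

See Problem 8.

Theorem 4.2.12 Let $V$ be an IPS over $\mathbb{C}$ of dimension $n$. Let
$T \in \mathrm{Hom}(V)$. Let $\lambda_1, \dots, \lambda_n \in \mathbb{C}$ be the $n$ eigenvalues of $T$ counted with their
multiplicities. Then there exists an orthonormal basis $\mathbf{g}_1, \dots, \mathbf{g}_n$ of $V$ with the
following properties:

(4.2.2) $T\, \mathrm{span}(\mathbf{g}_1, \dots, \mathbf{g}_i) \subset \mathrm{span}(\mathbf{g}_1, \dots, \mathbf{g}_i)$, $\langle T\mathbf{g}_i, \mathbf{g}_i \rangle = \lambda_i$, $i = 1, \dots, n$.

Let $V$ be an IPS over $\mathbb{R}$ of dimension $n$. Let $T \in \mathrm{Hom}(V)$ and assume
that $\mathrm{spec}(T) \subset \mathbb{R}$. Let $\lambda_1, \dots, \lambda_n \in \mathbb{R}$ be the $n$ eigenvalues of $T$ counted with
their multiplicities. Then there exists an orthonormal basis $\mathbf{g}_1, \dots, \mathbf{g}_n$ of $V$
such that (4.2.2) holds.

Proof. Assume first that $V$ is an IPS over $\mathbb{C}$ of dimension $n$. The proof
is by induction on $n$. For $n = 1$ the theorem is trivial. Assume that
$n > 1$. Since $\lambda_1 \in \mathrm{spec}(T)$ it follows that there exists $\mathbf{g}_1 \in V$, $\langle \mathbf{g}_1, \mathbf{g}_1 \rangle = 1$,
such that $T\mathbf{g}_1 = \lambda_1 \mathbf{g}_1$. Let $U := \mathrm{span}(\mathbf{g}_1)^\perp$. Let $P$ be the orthogonal
projection on $U$. Let $T_1 := PT|_U$. Then $T_1 \in \mathrm{Hom}(U)$. Let $\tilde{\lambda}_2, \dots, \tilde{\lambda}_n$
be the eigenvalues of $T_1$ counted with their multiplicities. The induction
hypothesis yields the existence of an orthonormal basis $\mathbf{g}_2, \dots, \mathbf{g}_n$ of $U$ such
that

$T_1\, \mathrm{span}(\mathbf{g}_2, \dots, \mathbf{g}_i) \subset \mathrm{span}(\mathbf{g}_2, \dots, \mathbf{g}_i)$, $\langle T_1\mathbf{g}_i, \mathbf{g}_i \rangle = \tilde{\lambda}_i$, $i = 2, \dots, n$.

It is straightforward to show that $T\, \mathrm{span}(\mathbf{g}_1, \dots, \mathbf{g}_i) \subset \mathrm{span}(\mathbf{g}_1, \dots, \mathbf{g}_i)$ for
$i = 1, \dots, n$. Hence in the orthonormal basis $\mathbf{g}_1, \dots, \mathbf{g}_n$ $T$ is represented by
an upper triangular matrix $B = (b_{ij})_1^n$, with $b_{11} = \lambda_1$ and $b_{ii} = \tilde{\lambda}_i$,
$i = 2, \dots, n$. Hence $\lambda_1, \tilde{\lambda}_2, \dots, \tilde{\lambda}_n$ are the eigenvalues of $T$ counted with their
multiplicities. This establishes the theorem in this case. The real case is
treated similarly. □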

Combine the above results with Problems 6 and 12 to deduce: 

Corollary 4.2.13 Let $A \in \mathbb{C}^{n \times n}$. Let $\lambda_1, \dots, \lambda_n \in \mathbb{C}$ be the $n$ eigenvalues
of $A$ counted with their multiplicities. Then there exist an upper triangular
matrix $B = (b_{ij})_1^n \in \mathbb{C}^{n \times n}$, such that $b_{ii} = \lambda_i$, $i = 1, \dots, n$, and a unitary
matrix $U \in \mathrm{U}_n$ such that $A = UBU^{-1}$. If $A \in \mathrm{N}(n, \mathbb{C})$ then $B$ is a diagonal
matrix.

Let $A \in \mathbb{R}^{n \times n}$ and assume that $\mathrm{spec}(A) \subset \mathbb{R}$. Then $A = UBU^{-1}$ where
$U$ can be chosen to be a real orthogonal matrix and $B$ a real upper triangular
matrix. If $A \in \mathrm{N}(n, \mathbb{R})$ and $\mathrm{spec}(A) \subset \mathbb{R}$ then $B$ is a diagonal matrix.

It is easy to show that $U$ in the above Corollary can be chosen in $\mathrm{SU}_n$ or
$\mathrm{SO}(n, \mathbb{R})$ respectively (Problem 11).

Definition 4.2.14 Let $V$ be a vector space and assume that $T : V \to V$
is a linear operator. Let $\mathbf{0} \neq \mathbf{v} \in V$. Then $W = \mathrm{span}(\mathbf{v}, T\mathbf{v}, T^2\mathbf{v}, \dots)$ is
called a cyclic invariant subspace of $T$ generated by $\mathbf{v}$. (It is also referred
to as a Krylov subspace of $T$ generated by $\mathbf{v}$.) Sometimes we will call $W$ just
a cyclic subspace, or Krylov subspace.



Theorem 4.2.15 Let $V$ be a finite dimensional IPS. Let $T : V \to V$
be a linear operator. For $\mathbf{0} \neq \mathbf{v} \in V$ let $W = \mathrm{span}(\mathbf{v}, T\mathbf{v}, \dots, T^{r-1}\mathbf{v})$ be a
cyclic $T$-invariant subspace of dimension $r$ generated by $\mathbf{v}$. Let $\mathbf{u}_1, \dots, \mathbf{u}_r$
be an orthonormal basis of $W$ obtained by the Gram-Schmidt process from
the basis $[\mathbf{v}, T\mathbf{v}, \dots, T^{r-1}\mathbf{v}]$ of $W$. Then $\langle T\mathbf{u}_i, \mathbf{u}_j \rangle = 0$ for $1 \le i \le j - 2$,
i.e. the representation matrix of $T|W$ in the basis $[\mathbf{u}_1, \dots, \mathbf{u}_r]$ is upper
Hessenberg. If $T$ is self-adjoint then the representation matrix of $T|W$ in
the basis $[\mathbf{u}_1, \dots, \mathbf{u}_r]$ is a tridiagonal hermitian matrix.

Proof. Let $W_j = \mathrm{span}(\mathbf{v}, \dots, T^{j-1}\mathbf{v})$ for $j = 1, \dots, r+1$. Clearly
$TW_j \subset W_{j+1}$ for $j = 1, \dots, r$. The assumption that $W$ is a $T$-invariant
subspace yields $W = W_r = W_{r+1}$. Since $\dim W = r$ it follows that
$\mathbf{v}, \dots, T^{r-1}\mathbf{v}$ are linearly independent. Hence $[\mathbf{v}, \dots, T^{r-1}\mathbf{v}]$ is a basis for
$W$. Recall that $\mathrm{span}(\mathbf{u}_1, \dots, \mathbf{u}_j) = W_j$ for $j = 1, \dots, r$. Let $r \ge j \ge i + 2$.
Then $T\mathbf{u}_i \in TW_i \subset W_{i+1}$. As $\mathbf{u}_j \perp W_{i+1}$ it follows that $\langle T\mathbf{u}_i, \mathbf{u}_j \rangle = 0$.
Assume that $T^* = T$. Let $r \ge i \ge j + 2$. Then $\langle T\mathbf{u}_i, \mathbf{u}_j \rangle = \langle \mathbf{u}_i, T\mathbf{u}_j \rangle = 0$.
Hence the representation matrix of $T|W$ in the basis $[\mathbf{u}_1, \dots, \mathbf{u}_r]$ is a
tridiagonal hermitian matrix. □
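
The statement of Theorem 4.2.15 can be observed numerically. The sketch below is an illustration only, under our own assumptions: $V = \mathbb{R}^4$ with the standard inner product, a hand-picked real symmetric matrix `T` and starting vector `v`, a hypothetical tolerance `tol`, and the modified form of Gram-Schmidt (algebraically the same process, chosen here because it is less sensitive to rounding). It orthonormalizes $\mathbf{v}, T\mathbf{v}, T^2\mathbf{v}, \dots$ and forms the matrix with entries $\langle T\mathbf{u}_i, \mathbf{u}_j \rangle$.

```python
import math

def dot(x, y):
    """Standard real inner product."""
    return sum(a * b for a, b in zip(x, y))

def matvec(T, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [dot(row, x) for row in T]

def krylov_orthonormal_basis(T, v, tol=1e-10):
    """Orthonormalize v, Tv, T^2 v, ... until the span stops growing."""
    basis, w = [], list(v)
    for _ in range(len(v)):
        y = list(w)
        for u in basis:
            c = dot(y, u)  # modified Gram-Schmidt: project the running remainder
            y = [yi - c * ui for yi, ui in zip(y, u)]
        norm = math.sqrt(dot(y, y))
        if norm <= tol:    # T^k v already lies in the span: stop
            break
        basis.append([yi / norm for yi in y])
        w = matvec(T, w)   # next Krylov vector
    return basis

def representation(T, basis):
    """Matrix whose (j, i) entry is <T u_i, u_j>, as in (4.2.1)."""
    images = [matvec(T, u) for u in basis]
    r = len(basis)
    return [[dot(images[i], basis[j]) for i in range(r)] for j in range(r)]

T = [[4.0, 1.0, 2.0, 0.0],
     [1.0, 3.0, 0.0, 1.0],
     [2.0, 0.0, 2.0, 1.0],
     [0.0, 1.0, 1.0, 1.0]]   # real symmetric, so T* = T
v = [1.0, 0.0, 0.0, 0.0]

U = krylov_orthonormal_basis(T, v)
H = representation(T, U)
```

For this choice the Krylov subspace is all of $\mathbb{R}^4$, and the entries of `H` more than one step away from the diagonal vanish up to rounding error, which is the tridiagonal structure asserted for self-adjoint $T$.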



Problems 

1. Prove Proposition 4.2.1. 

2. Let $P, Q \in \mathrm{Hom}(V)$, $a, b \in \mathbb{F}$. Show that $(aP + bQ)^* = \bar{a}P^* + \bar{b}Q^*$.

3. Prove Proposition 4.2.3. 

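4. Prove Proposition 4.2.4 for finite dimensional $V$. (Hint: Choose an
orthonormal basis in $V$.)

5. Show the following:

$\mathrm{SO}(n, \mathbb{D}) \subset \mathrm{O}(n, \mathbb{D}) \subset \mathrm{GL}(n, \mathbb{D})$,
$\mathrm{S}(n, \mathbb{R}) \subset \mathrm{H}_n \subset \mathrm{N}(n, \mathbb{C})$,
$\mathrm{AS}(n, \mathbb{R}) \subset \mathrm{AH}_n \subset \mathrm{N}(n, \mathbb{C})$,
$\mathrm{S}(n, \mathbb{R}), \mathrm{AS}(n, \mathbb{R}) \subset \mathrm{N}(n, \mathbb{R}) \subset \mathrm{N}(n, \mathbb{C})$,
$\mathrm{O}(n, \mathbb{R}) \subset \mathrm{U}_n \subset \mathrm{N}(n, \mathbb{C})$,
$\mathrm{SO}(n, \mathbb{D})$, $\mathrm{O}(n, \mathbb{D})$, $\mathrm{SU}_n$, $\mathrm{U}_n$ are groups,
$\mathrm{S}(n, \mathbb{D})$ is a $\mathbb{D}$-module of dimension $\binom{n+1}{2}$,
$\mathrm{AS}(n, \mathbb{D})$ is a $\mathbb{D}$-module of dimension $\binom{n}{2}$,
$\mathrm{H}_n$ is an $\mathbb{R}$-vector space of dimension $n^2$,
$\mathrm{AH}_n = \sqrt{-1}\, \mathrm{H}_n$.

6. Let $E = \{\mathbf{e}_1, \dots, \mathbf{e}_n\}$ be an orthonormal basis in an IPS $V$ over $\mathbb{F}$. Let
$G = \{\mathbf{g}_1, \dots, \mathbf{g}_n\}$ be another basis in $V$. Show that $G$ is an orthonormal
basis if and only if the transfer matrix either from $E$ to $G$ or from
$G$ to $E$ is a unitary matrix.

7. Prove Proposition 4.2.10.

8. Prove Proposition 4.2.11.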



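9. a. Show that $A \in \mathrm{SO}(2, \mathbb{R})$ is of the form
$A = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}$, $\theta \in \mathbb{R}$.

b. Show that $\mathrm{SO}(2, \mathbb{R}) = e^{\mathrm{AS}(2, \mathbb{R})}$. That is, for any $B \in \mathrm{AS}(2, \mathbb{R})$,
$e^B \in \mathrm{SO}(2, \mathbb{R})$, and any $A \in \mathrm{SO}(2, \mathbb{R})$ is $e^B$ for some $B \in \mathrm{AS}(2, \mathbb{R})$.
(Hint: Consider the power series for $e^B$, $B = \begin{pmatrix} 0 & \theta \\ -\theta & 0 \end{pmatrix}$.)

c. Show that $\mathrm{SO}(n, \mathbb{R}) = e^{\mathrm{AS}(n, \mathbb{R})}$. (Hint: Use Propositions 4.2.10
and 4.2.11 and part b.)

d. Show that $\mathrm{SO}(n, \mathbb{R})$ is a path connected space. (See part e.)

e. Let $V$ be an $n$ ($> 1$)-dimensional IPS over $\mathbb{F} = \mathbb{R}$. Let $p \in \{1, \dots, n-1\}$.
Assume that $\mathbf{x}_1, \dots, \mathbf{x}_p$ and $\mathbf{y}_1, \dots, \mathbf{y}_p$ are two orthonormal systems in
$V$. Show that these two o.n.s. are path connected. That is, there
are $p$ continuous mappings $\mathbf{z}_i(t) : [0, 1] \to V$, $i = 1, \dots, p$, such that
for each $t \in [0, 1]$ $\mathbf{z}_1(t), \dots, \mathbf{z}_p(t)$ is an o.n.s. and $\mathbf{z}_i(0) = \mathbf{x}_i$, $\mathbf{z}_i(1) = \mathbf{y}_i$,
$i = 1, \dots, p$.

10. a. Show that $\mathrm{U}_n = e^{\mathrm{AH}_n}$. (Hint: Use Proposition 4.2.8 and its
proof.)

b. Show that $\mathrm{U}_n$ is path connected.

c. Prove Problem 9e for $\mathbb{F} = \mathbb{C}$.

11. Show

(a) $D_1 D D_1^* = D$ for any $D \in \mathrm{D}(n, \mathbb{C})$, $D_1 \in \mathrm{DU}_n$.

(b) $A \in \mathrm{N}(n, \mathbb{C}) \iff A = UDU^*$, $U \in \mathrm{SU}_n$, $D \in \mathrm{D}(n, \mathbb{C})$.

(c) $A \in \mathrm{N}(n, \mathbb{R})$, $\sigma(A) \subset \mathbb{R} \iff A = UDU^T$, $U \in \mathrm{SO}(n, \mathbb{R})$, $D \in \mathrm{D}(n, \mathbb{R})$.

12. Show that an upper triangular or a lower triangular matrix $B \in \mathbb{C}^{n \times n}$
is normal if and only if $B$ is diagonal. (Hint: consider the equality
$(BB^*)_{11} = (B^*B)_{11}$.)

13. Let the assumptions of Theorem 4.2.15 hold. Show that instead of
performing the Gram-Schmidt process on $\mathbf{v}, T\mathbf{v}, \dots, T^{r-1}\mathbf{v}$ one can
perform the following process. Let $\mathbf{w}_1 := \frac{1}{\|\mathbf{v}\|} \mathbf{v}$. Assume that one has
already obtained $i$ orthonormal vectors $\mathbf{w}_1, \dots, \mathbf{w}_i$. Let
$\tilde{\mathbf{w}}_{i+1} := T\mathbf{w}_i - \sum_{j=1}^{i} \langle T\mathbf{w}_i, \mathbf{w}_j \rangle \mathbf{w}_j$. If $\tilde{\mathbf{w}}_{i+1} = \mathbf{0}$ then stop the process, i.e. one is left
with $i$ orthonormal vectors. If $\tilde{\mathbf{w}}_{i+1} \neq \mathbf{0}$ then $\mathbf{w}_{i+1} := \frac{1}{\|\tilde{\mathbf{w}}_{i+1}\|} \tilde{\mathbf{w}}_{i+1}$
and continue the process. Show that the process ends after obtaining
$r$ orthonormal vectors $\mathbf{w}_1, \dots, \mathbf{w}_r$ and $\mathbf{u}_i = \mathbf{w}_i$ for $i = 1, \dots, r$. (This
is a version of the Lanczos tridiagonalization process.)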



4.3 Symmetric bilinear and hermitian forms 

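Definition 4.3.1 Let $V$ be a module over $\mathbb{D}$ and $Q : V \times V \to \mathbb{D}$. $Q$
is called a symmetric bilinear form (on $V$) if the following conditions are
satisfied:

(a) $Q(\mathbf{x}, \mathbf{y}) = Q(\mathbf{y}, \mathbf{x})$ for all $\mathbf{x}, \mathbf{y} \in V$ (symmetricity);

(b) $Q(a\mathbf{x} + b\mathbf{z}, \mathbf{y}) = aQ(\mathbf{x}, \mathbf{y}) + bQ(\mathbf{z}, \mathbf{y})$ for all $a, b \in \mathbb{D}$ and $\mathbf{x}, \mathbf{y}, \mathbf{z} \in V$
(bilinearity).

For $\mathbb{D} = \mathbb{C}$, $Q$ is called a hermitian form (on $V$) if $Q$ satisfies the
conditions (a') and (b), where

(a') $Q(\mathbf{x}, \mathbf{y}) = \overline{Q(\mathbf{y}, \mathbf{x})}$ for all $\mathbf{x}, \mathbf{y} \in V$ (bar symmetricity).

The following results are elementary (see Problems 1-2):

Proposition 4.3.2 Let $V$ be a module over $\mathbb{D}$ with a basis $E = \{\mathbf{e}_1, \dots, \mathbf{e}_n\}$.
Then there is a 1-1 correspondence between a symmetric bilinear form $Q$
on $V$ and $A \in \mathrm{S}(n, \mathbb{D})$:

$Q(\mathbf{x}, \mathbf{y}) = \eta^T A \xi$,

$\mathbf{x} = \sum_{i=1}^{n} \xi_i \mathbf{e}_i$, $\mathbf{y} = \sum_{i=1}^{n} \eta_i \mathbf{e}_i$, $\xi = (\xi_1, \dots, \xi_n)^T$, $\eta = (\eta_1, \dots, \eta_n)^T \in \mathbb{D}^n$.

Let $V$ be a vector space over $\mathbb{C}$ with a basis $E = \{\mathbf{e}_1, \dots, \mathbf{e}_n\}$. Then there is a
1-1 correspondence between a hermitian form $Q$ on $V$ and $A \in \mathrm{H}_n$:

$Q(\mathbf{x}, \mathbf{y}) = \eta^* A \xi$,

$\mathbf{x} = \sum_{i=1}^{n} \xi_i \mathbf{e}_i$, $\mathbf{y} = \sum_{i=1}^{n} \eta_i \mathbf{e}_i$, $\xi = (\xi_1, \dots, \xi_n)^T$, $\eta = (\eta_1, \dots, \eta_n)^T \in \mathbb{C}^n$.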

Definition 4.3.3 Let the assumptions of Proposition 4.3.2 hold. Then
$A$ is called the representation matrix of $Q$ in the basis $E$.

Proposition 4.3.4 Let the assumptions of Proposition 4.3.2 hold. Let
$F = \{\mathbf{f}_1, \dots, \mathbf{f}_n\}$ be another basis of the $\mathbb{D}$-module $V$. Then the symmetric bilinear
form $Q$ is represented by $B \in \mathrm{S}(n, \mathbb{D})$ in the basis $F$, where $B$ is congruent
to $A$:

$B = U^T A U$, $U \in \mathrm{GL}(n, \mathbb{D})$,

and $U$ is the matrix corresponding to the basis change from $F$ to $E$. For
$\mathbb{D} = \mathbb{C}$ the hermitian form $Q$ is represented by $B \in \mathrm{H}_n$ in the basis $F$, where
$B$ is hermicongruent to $A$:

$B = U^* A U$, $U \in \mathrm{GL}(n, \mathbb{C})$,

and $U$ is the matrix corresponding to the basis change from $F$ to $E$.

In what follows we assume that $\mathbb{D} = \mathbb{F} = \mathbb{R}, \mathbb{C}$.

Proposition 4.3.5 Let $V$ be an $n$-dimensional vector space over $\mathbb{R}$.
Let $Q : V \times V \to \mathbb{R}$ be a symmetric bilinear form. Let $A \in \mathrm{S}(n, \mathbb{R})$ be
the representation matrix of $Q$ with respect to a basis $E$ in $V$. Let $V_c$
be the extension of $V$ over $\mathbb{C}$. Then there exists a unique hermitian form
$Q_c : V_c \times V_c \to \mathbb{C}$ such that $Q_c|_{V \times V} = Q$ and $Q_c$ is represented by $A$ with
respect to the basis $E$ in $V_c$.

See Problem 3.

Normalization 4.3.6 Let $V$ be a finite dimensional IPS over $\mathbb{F}$. Let
$Q : V \times V \to \mathbb{F}$ be either a symmetric bilinear form for $\mathbb{F} = \mathbb{R}$ or a
hermitian form for $\mathbb{F} = \mathbb{C}$. Then a representation matrix $A$ of $Q$ is chosen
with respect to an orthonormal basis $E$.

The following proposition is straightforward (see Problem 4). 

Proposition 4.3.7 Let $V$ be an $n$-dimensional IPS over $\mathbb{F}$. Let
$Q : V \times V \to \mathbb{F}$ be either a symmetric bilinear form for $\mathbb{F} = \mathbb{R}$ or a
hermitian form for $\mathbb{F} = \mathbb{C}$. Then there exists a unique $T \in \mathrm{S}(V)$ such that
$Q(\mathbf{x}, \mathbf{y}) = \langle T\mathbf{x}, \mathbf{y} \rangle$ for any $\mathbf{x}, \mathbf{y} \in V$. In any orthonormal basis of $V$, $Q$
and $T$ are represented by the same matrix $A$. In particular the characteristic
polynomial $p(\lambda)$ of $T$ is called the characteristic polynomial of $Q$. It has
only real roots:

$\lambda_1(Q) \ge \dots \ge \lambda_n(Q)$,

which are called the eigenvalues of $Q$. Furthermore there exists an orthonormal
basis $F = \{\mathbf{f}_1, \dots, \mathbf{f}_n\}$ in $V$ such that $D = \mathrm{diag}(\lambda_1(Q), \dots, \lambda_n(Q))$ is the
representation matrix of $Q$ in $F$.

Vice versa, for any $T \in \mathrm{S}(V)$ and any subspace $U \subset V$ the form
$Q(T, U)$ defined by

$Q(T, U)(\mathbf{x}, \mathbf{y}) := \langle T\mathbf{x}, \mathbf{y} \rangle$ for $\mathbf{x}, \mathbf{y} \in U$

is either a symmetric bilinear form for $\mathbb{F} = \mathbb{R}$ or a hermitian form for
$\mathbb{F} = \mathbb{C}$.

In the rest of the book we use the following normalization unless stated
otherwise.

Normalization 4.3.8 Let $V$ be an $n$-dimensional IPS over $\mathbb{F}$. Assume
that $T \in \mathrm{S}(V)$. Then arrange the eigenvalues of $T$ counted with their
multiplicities in the decreasing order

$\lambda_1(T) \ge \dots \ge \lambda_n(T)$.

The same normalization applies to real symmetric matrices and complex
hermitian matrices.

Problems 

1. Prove Proposition 4.3.2. 

2. Prove Proposition 4.3.4. 

3. Prove Proposition 4.3.5. 

4. Prove Proposition 4.3.7.

4.4 Max-min characterizations of eigenvalues

Definition 4.4.1 Let $V$ be a finite dimensional space over the field $\mathbb{F}$,
and let $n = \dim V$. Denote by $\mathrm{Gr}(m, V)$ the space of all subspaces of $V$ of
dimension $m \in [0, n] \cap \mathbb{Z}_+$.

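Theorem 4.4.2 (The convoy principle) Let $V$ be an $n$-dimensional
IPS. Let $T \in \mathrm{S}(V)$. Then

(4.4.1) $\lambda_k(T) = \max_{U \in \mathrm{Gr}(k, V)} \min_{\mathbf{0} \neq \mathbf{x} \in U} \frac{\langle T\mathbf{x}, \mathbf{x} \rangle}{\langle \mathbf{x}, \mathbf{x} \rangle} = \max_{U \in \mathrm{Gr}(k, V)} \lambda_k(Q(T, U))$, $k = 1, \dots, n$,

where the quadratic form $Q(T, U)$ is defined in Proposition 4.3.7. For
$k \in [1, n] \cap \mathbb{N}$ let $U$ be an invariant subspace of $T$ spanned by eigenvectors
$\mathbf{e}_1, \dots, \mathbf{e}_k$ corresponding to the eigenvalues $\lambda_1(T), \dots, \lambda_k(T)$. Then $\lambda_k(T) =
\lambda_k(Q(T, U))$. Let $U \in \mathrm{Gr}(k, V)$ and assume that $\lambda_k(T) = \lambda_k(Q(T, U))$.
Then $U$ contains an eigenvector of $T$ corresponding to $\lambda_k(T)$.
In particular

(4.4.2) $\lambda_1(T) = \max_{\mathbf{0} \neq \mathbf{x} \in V} \frac{\langle T\mathbf{x}, \mathbf{x} \rangle}{\langle \mathbf{x}, \mathbf{x} \rangle}$, $\lambda_n(T) = \min_{\mathbf{0} \neq \mathbf{x} \in V} \frac{\langle T\mathbf{x}, \mathbf{x} \rangle}{\langle \mathbf{x}, \mathbf{x} \rangle}$.

Moreover for any $\mathbf{x} \neq \mathbf{0}$

$\lambda_1(T) = \frac{\langle T\mathbf{x}, \mathbf{x} \rangle}{\langle \mathbf{x}, \mathbf{x} \rangle} \iff T\mathbf{x} = \lambda_1(T)\mathbf{x}$,
$\lambda_n(T) = \frac{\langle T\mathbf{x}, \mathbf{x} \rangle}{\langle \mathbf{x}, \mathbf{x} \rangle} \iff T\mathbf{x} = \lambda_n(T)\mathbf{x}$.

The quotient $\frac{\langle T\mathbf{x}, \mathbf{x} \rangle}{\langle \mathbf{x}, \mathbf{x} \rangle}$, $\mathbf{0} \neq \mathbf{x} \in V$, is called the Rayleigh quotient. The
characterization (4.4.1) is called the convoy principle.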
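
The extremal characterization (4.4.2) is easy to see on a small example. The sketch below assumes $V = \mathbb{R}^2$ with the standard inner product and the symmetric matrix $T = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$, whose eigenvalues are 3 and 1 with eigenvectors $(1, 1)$ and $(1, -1)$; the function name `rayleigh_quotient` and the sample vectors are our own choices.

```python
def rayleigh_quotient(T, x):
    """Return <Tx, x> / <x, x> for a real symmetric T and nonzero real x."""
    Tx = [sum(t * xi for t, xi in zip(row, x)) for row in T]
    return sum(a * b for a, b in zip(Tx, x)) / sum(xi * xi for xi in x)

T = [[2.0, 1.0], [1.0, 2.0]]  # eigenvalues 3 and 1
samples = [[1.0, 0.0], [1.0, 1.0], [1.0, -1.0], [3.0, -2.0], [0.5, 4.0]]
values = [rayleigh_quotient(T, x) for x in samples]
```

Every value lies between $\lambda_2(T) = 1$ and $\lambda_1(T) = 3$, and the bounds are attained exactly at the eigenvectors $(1, 1)$ and $(1, -1)$, in agreement with the equality cases of the theorem.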

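Proof. Choose an orthonormal basis $E = \{\mathbf{e}_1, \dots, \mathbf{e}_n\}$ such that

(4.4.3) $T\mathbf{e}_i = \lambda_i(T)\mathbf{e}_i$, $\langle \mathbf{e}_i, \mathbf{e}_j \rangle = \delta_{ij}$, $i, j = 1, \dots, n$.

Then

(4.4.4) $\frac{\langle T\mathbf{x}, \mathbf{x} \rangle}{\langle \mathbf{x}, \mathbf{x} \rangle} = \frac{\sum_{i=1}^{n} \lambda_i(T)|x_i|^2}{\sum_{i=1}^{n} |x_i|^2}$, $\mathbf{x} = \sum_{i=1}^{n} x_i \mathbf{e}_i \neq \mathbf{0}$.

The above equality yields straightforwardly (4.4.2) and the equality cases in
these characterizations. Let $U \in \mathrm{Gr}(k, V)$. Then the minimal
characterization of $\lambda_k(Q(T, U))$ yields the equality

(4.4.5) $\lambda_k(Q(T, U)) = \min_{\mathbf{0} \neq \mathbf{x} \in U} \frac{\langle T\mathbf{x}, \mathbf{x} \rangle}{\langle \mathbf{x}, \mathbf{x} \rangle}$ for any $U \in \mathrm{Gr}(k, V)$.

Next there exists $\mathbf{0} \neq \mathbf{x} \in U$ such that $\langle \mathbf{x}, \mathbf{e}_i \rangle = 0$ for $i = 1, \dots, k-1$. (For
$k = 1$ this condition is void.) Hence

$\frac{\langle T\mathbf{x}, \mathbf{x} \rangle}{\langle \mathbf{x}, \mathbf{x} \rangle} = \frac{\sum_{i=k}^{n} \lambda_i(T)|x_i|^2}{\sum_{i=k}^{n} |x_i|^2} \le \lambda_k(T) \Rightarrow \lambda_k(T) \ge \lambda_k(Q(T, U))$.

Let

(4.4.6) $\lambda_1(T) = \dots = \lambda_{n_1}(T) > \lambda_{n_1+1}(T) = \dots = \lambda_{n_2}(T) > \dots > \lambda_{n_{r-1}+1}(T) = \dots = \lambda_{n_r}(T) = \lambda_n(T)$,
$n_0 = 0 < n_1 < \dots < n_r = n$.

Assume that $n_{j-1} < k \le n_j$. Suppose that $\lambda_k(Q(T, U)) = \lambda_k(T)$. Then
for $\mathbf{x} \in U$ such that $\langle \mathbf{x}, \mathbf{e}_i \rangle = 0$ for $i = 1, \dots, k-1$ we have the equality
$\lambda_k(Q(T, U)) = \lambda_k(T)$ if and only if $\mathbf{x} = \sum_{i=k}^{n_j} x_i \mathbf{e}_i$. Thus $T\mathbf{x} = \lambda_k(T)\mathbf{x}$.

Let $U_k = \mathrm{span}(\mathbf{e}_1, \dots, \mathbf{e}_k)$. Let $\mathbf{0} \neq \mathbf{x} = \sum_{i=1}^{k} x_i \mathbf{e}_i \in U_k$. Then

$\frac{\langle T\mathbf{x}, \mathbf{x} \rangle}{\langle \mathbf{x}, \mathbf{x} \rangle} = \frac{\sum_{i=1}^{k} \lambda_i(T)|x_i|^2}{\sum_{i=1}^{k} |x_i|^2} \ge \lambda_k(T) \Rightarrow \lambda_k(Q(T, U_k)) \ge \lambda_k(T)$.

Hence $\lambda_k(Q(T, U_k)) = \lambda_k(T)$. □

It can be shown that for $k > 1$ and $\lambda_1(T) > \lambda_k(T)$ there exists $U \in
\mathrm{Gr}(k, V)$ such that $\lambda_k(T) = \lambda_k(Q(T, U))$ and $U$ is not an invariant subspace
of $T$; in particular $U$ does not contain all $\mathbf{e}_1, \dots, \mathbf{e}_k$ satisfying (4.4.3). (See
Problem 1.)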

Corollary 4.4.3 Let the assumptions of Theorem 4.4.2 hold. Let
$1 \le \ell \le n$. Then

(4.4.7) $\lambda_k(T) = \max_{W \in \mathrm{Gr}(\ell, V)} \lambda_k(Q(T, W))$, $k = 1, \dots, \ell$.

Proof. For $k \le \ell$ apply Theorem 4.4.2 to $\lambda_k(Q(T, W))$ to deduce that
$\lambda_k(Q(T, W)) \le \lambda_k(T)$. Let $U_\ell = \mathrm{span}(\mathbf{e}_1, \dots, \mathbf{e}_\ell)$. Then

$\lambda_k(Q(T, U_\ell)) = \lambda_k(T)$, $k = 1, \dots, \ell$.

□

Theorem 4.4.4 (Courant-Fischer principle) Let $V$ be an $n$-dimensional
IPS and $T \in \mathrm{S}(V)$. Then

$\lambda_k(T) = \min_{W \in \mathrm{Gr}(k-1, V)} \max_{\mathbf{0} \neq \mathbf{x} \in W^\perp} \frac{\langle T\mathbf{x}, \mathbf{x} \rangle}{\langle \mathbf{x}, \mathbf{x} \rangle}$, $k = 1, \dots, n$.

See Problem 2 for the proof of the theorem and the following corollary.

Corollary 4.4.5 Let $V$ be an $n$-dimensional IPS and $T \in \mathrm{S}(V)$. Let
$k, \ell \in [1, n]$ be integers satisfying $k \le \ell$. Then

$\lambda_{n-\ell+k}(T) \le \lambda_k(Q(T, W)) \le \lambda_k(T)$, for any $W \in \mathrm{Gr}(\ell, V)$.

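Theorem 4.4.6 Let $V$ be an $n$-dimensional IPS and $S, T \in \mathrm{S}(V)$.
Then for any $i, j \in \mathbb{N}$ with $i + j - 1 \le n$ the inequality
$\lambda_{i+j-1}(S + T) \le \lambda_i(S) + \lambda_j(T)$ holds.

Proof. Let $U_{i-1}, V_{j-1} \subset V$ be the eigenspaces of $S, T$ spanned by the
first $i - 1$ and $j - 1$ eigenvectors of $S, T$ respectively. So

$\langle S\mathbf{x}, \mathbf{x} \rangle \le \lambda_i(S)\langle \mathbf{x}, \mathbf{x} \rangle$, $\langle T\mathbf{y}, \mathbf{y} \rangle \le \lambda_j(T)\langle \mathbf{y}, \mathbf{y} \rangle$ for all $\mathbf{x} \in U_{i-1}^\perp$, $\mathbf{y} \in V_{j-1}^\perp$.

Note that $\dim U_{i-1} = i - 1$, $\dim V_{j-1} = j - 1$. Let $W = U_{i-1} + V_{j-1}$.
Then $\dim W = \ell - 1 \le i + j - 2$. Assume that $\mathbf{z} \in W^\perp$.
Then $\langle (S + T)\mathbf{z}, \mathbf{z} \rangle = \langle S\mathbf{z}, \mathbf{z} \rangle + \langle T\mathbf{z}, \mathbf{z} \rangle \le (\lambda_i(S) + \lambda_j(T))\langle \mathbf{z}, \mathbf{z} \rangle$. Hence
$\max_{\mathbf{0} \neq \mathbf{z} \in W^\perp} \frac{\langle (S + T)\mathbf{z}, \mathbf{z} \rangle}{\langle \mathbf{z}, \mathbf{z} \rangle} \le \lambda_i(S) + \lambda_j(T)$. Use Theorem 4.4.4 to deduce that
$\lambda_{i+j-1}(S + T) \le \lambda_\ell(S + T) \le \lambda_i(S) + \lambda_j(T)$. □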
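
The inequality of Theorem 4.4.6 can be checked on a concrete pair of real symmetric $2 \times 2$ matrices, for which the eigenvalues have a closed form. The sketch below is only an illustration: the helper `eig_sym2`, the two matrices, and the small slack `1e-12` that absorbs rounding are our own assumptions.

```python
import math

def eig_sym2(M):
    """Eigenvalues (lambda_1 >= lambda_2) of a real symmetric 2x2 matrix."""
    a, b, d = M[0][0], M[0][1], M[1][1]
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return mean + radius, mean - radius

S = [[2.0, 1.0], [1.0, 0.0]]
T = [[1.0, -3.0], [-3.0, 4.0]]
ST = [[S[i][j] + T[i][j] for j in range(2)] for i in range(2)]

s, t, st = eig_sym2(S), eig_sym2(T), eig_sym2(ST)

# lambda_{i+j-1}(S+T) <= lambda_i(S) + lambda_j(T) whenever i + j - 1 <= n = 2.
# Tuples are 0-based, so lambda_m is stored at index m - 1.
weyl_holds = all(
    st[i + j - 2] <= s[i - 1] + t[j - 1] + 1e-12
    for i in (1, 2) for j in (1, 2) if i + j - 1 <= 2
)
```

For $n = 2$ the admissible index pairs are $(i, j) = (1, 1), (1, 2), (2, 1)$, and `weyl_holds` confirms the three inequalities for this pair of matrices.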



Definition 4.4.7 Let $V$ be an $n$-dimensional IPS. Fix an integer
$k \in [1, n]$. Then $F_k = \{\mathbf{f}_1, \dots, \mathbf{f}_k\}$ is called an orthonormal $k$-frame if
$\langle \mathbf{f}_i, \mathbf{f}_j \rangle = \delta_{ij}$ for $i, j = 1, \dots, k$. Denote by $\mathrm{Fr}(k, V)$ the set of all
orthonormal $k$-frames in $V$.

Note that each $F_k \in \mathrm{Fr}(k, V)$ induces $U = \mathrm{span}\, F_k \in \mathrm{Gr}(k, V)$. Vice
versa, any $U \in \mathrm{Gr}(k, V)$ induces the set $\mathrm{Fr}(k, U)$ of all orthonormal
$k$-frames which span $U$.

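Theorem 4.4.8 Let $V$ be an $n$-dimensional IPS and $T \in \mathrm{S}(V)$. Then
for any integer $k \in [1, n]$

$\sum_{i=1}^{k} \lambda_i(T) = \max_{\{\mathbf{f}_1, \dots, \mathbf{f}_k\} \in \mathrm{Fr}(k, V)} \sum_{i=1}^{k} \langle T\mathbf{f}_i, \mathbf{f}_i \rangle$.

Furthermore

$\sum_{i=1}^{k} \lambda_i(T) = \sum_{i=1}^{k} \langle T\mathbf{f}_i, \mathbf{f}_i \rangle$

for some orthonormal $k$-frame $F_k = \{\mathbf{f}_1, \dots, \mathbf{f}_k\}$ if and only if $\mathrm{span}\, F_k$ is
spanned by $\mathbf{e}_1, \dots, \mathbf{e}_k$ satisfying (4.4.3).

Proof. Define

(4.4.8) $\mathrm{tr}\, Q(T, U) := \sum_{i=1}^{k} \lambda_i(Q(T, U))$ for $U \in \mathrm{Gr}(k, V)$,
$\mathrm{tr}_k T := \sum_{i=1}^{k} \lambda_i(T)$.

Let $F_k = \{\mathbf{f}_1, \dots, \mathbf{f}_k\} \in \mathrm{Fr}(k, V)$. Set $U = \mathrm{span}\, F_k$. Then in view of
Corollary 4.4.3

$\sum_{i=1}^{k} \langle T\mathbf{f}_i, \mathbf{f}_i \rangle = \mathrm{tr}\, Q(T, U) \le \sum_{i=1}^{k} \lambda_i(T)$.

Let $E_k := \{\mathbf{e}_1, \dots, \mathbf{e}_k\}$ where $\mathbf{e}_1, \dots, \mathbf{e}_n$ are given by (4.4.3). Clearly $\mathrm{tr}_k T =
\mathrm{tr}\, Q(T, \mathrm{span}\, E_k)$. This shows the maximal characterization of $\mathrm{tr}_k T$.

Let $U \in \mathrm{Gr}(k, V)$ and assume that $\mathrm{tr}_k T = \mathrm{tr}\, Q(T, U)$. Hence $\lambda_i(T) =
\lambda_i(Q(T, U))$ for $i = 1, \dots, k$. Then there exists $G_k = \{\mathbf{g}_1, \dots, \mathbf{g}_k\} \in \mathrm{Fr}(k, U)$
such that

$\min_{\mathbf{0} \neq \mathbf{x} \in \mathrm{span}(\mathbf{g}_1, \dots, \mathbf{g}_i)} \frac{\langle T\mathbf{x}, \mathbf{x} \rangle}{\langle \mathbf{x}, \mathbf{x} \rangle} = \lambda_i(Q(T, U)) = \lambda_i(T)$, $i = 1, \dots, k$.

Use Theorem 4.4.2 to deduce that $T\mathbf{g}_i = \lambda_i(T)\mathbf{g}_i$ for $i = 1, \dots, k$. □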



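Theorem 4.4.9 Let $V$ be an $n$-dimensional IPS and $T \in \mathrm{S}(V)$. Then
for any integers $k, \ell \in [1, n]$ such that $k + \ell \le n$

$\sum_{i=\ell+1}^{\ell+k} \lambda_i(T) = \min_{W \in \mathrm{Gr}(\ell, V)} \max_{\{\mathbf{f}_1, \dots, \mathbf{f}_k\} \in \mathrm{Fr}(k, V \cap W^\perp)} \sum_{i=1}^{k} \langle T\mathbf{f}_i, \mathbf{f}_i \rangle$.

Proof. Let $W_j := \mathrm{span}(\mathbf{e}_1, \dots, \mathbf{e}_j)$, $j = 1, \dots, n$, where $\mathbf{e}_1, \dots, \mathbf{e}_n$ are
given by (4.4.3). Then $V_1 := V \cap W_\ell^\perp$ is an invariant subspace of $T$. Let
$T_1 := T|V_1$. Then $\lambda_i(T_1) = \lambda_{\ell+i}(T)$ for $i = 1, \dots, n - \ell$. Theorem 4.4.8 for
$T_1$ yields

$\max_{\{\mathbf{f}_1, \dots, \mathbf{f}_k\} \in \mathrm{Fr}(k, V \cap W_\ell^\perp)} \sum_{i=1}^{k} \langle T\mathbf{f}_i, \mathbf{f}_i \rangle = \sum_{i=\ell+1}^{\ell+k} \lambda_i(T)$.

Let $T_2 := T|W_{\ell+k}$ and $W \in \mathrm{Gr}(\ell, V)$. Set $U := W_{\ell+k} \cap W^\perp$. Then
$\dim U \ge k$. Apply Theorem 4.4.8 to $-T_2$ to deduce

$\sum_{i=1}^{k} \lambda_i(-T_2) \ge \sum_{i=1}^{k} \langle -T\mathbf{f}_i, \mathbf{f}_i \rangle$ for $\{\mathbf{f}_1, \dots, \mathbf{f}_k\} \in \mathrm{Fr}(k, U)$.

The above inequality is equivalent to the inequality

$\sum_{i=\ell+1}^{\ell+k} \lambda_i(T) \le \sum_{i=1}^{k} \langle T\mathbf{f}_i, \mathbf{f}_i \rangle$ for $\{\mathbf{f}_1, \dots, \mathbf{f}_k\} \in \mathrm{Fr}(k, U)$
$\le \max_{\{\mathbf{f}_1, \dots, \mathbf{f}_k\} \in \mathrm{Fr}(k, V \cap W^\perp)} \sum_{i=1}^{k} \langle T\mathbf{f}_i, \mathbf{f}_i \rangle$.

The above inequalities yield the theorem. □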

Problems 

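1. Let $V$ be a 3-dimensional IPS and $T \in \mathrm{Hom}(V)$ be self-adjoint.
Assume that

$\lambda_1(T) > \lambda_2(T) > \lambda_3(T)$, $T\mathbf{e}_i = \lambda_i(T)\mathbf{e}_i$, $i = 1, 2, 3$.

Let $W = \mathrm{span}(\mathbf{e}_1, \mathbf{e}_3)$.

(a) Show that for each $t \in [\lambda_3(T), \lambda_1(T)]$ there exists a unique $W(t) \in
\mathrm{Gr}(1, W)$ such that $\lambda_1(Q(T, W(t))) = t$.

(b) Let $t \in [\lambda_2(T), \lambda_1(T)]$. Let $U(t) = \mathrm{span}(W(t), \mathbf{e}_2) \in \mathrm{Gr}(2, V)$.
Show that $\lambda_2(T) = \lambda_2(Q(T, U(t)))$.

2. (a) Let the assumptions of Theorem 4.4.4 hold. Let $W \in \mathrm{Gr}(k-1, V)$.
Show that there exists $\mathbf{0} \neq \mathbf{x} \in W^\perp$ such that $\langle \mathbf{x}, \mathbf{e}_i \rangle = 0$ for
$i = k+1, \dots, n$, where $\mathbf{e}_1, \dots, \mathbf{e}_n$ satisfy (4.4.3). Conclude that
$\lambda_1(Q(T, W^\perp)) \ge \frac{\langle T\mathbf{x}, \mathbf{x} \rangle}{\langle \mathbf{x}, \mathbf{x} \rangle} \ge \lambda_k(T)$.

(b) Let $U_\ell = \mathrm{span}(\mathbf{e}_1, \dots, \mathbf{e}_\ell)$. Show that $\lambda_1(Q(T, U_\ell^\perp)) = \lambda_{\ell+1}(T)$
for $\ell = 1, \dots, n-1$.

(c) Prove Theorem 4.4.4.

(d) Prove Corollary 4.4.5. (Hint: Choose $U \in \mathrm{Gr}(k, W)$ such that
$U \subset W \cap \mathrm{span}(\mathbf{e}_{n-\ell+k+1}, \dots, \mathbf{e}_n)^\perp$. Then $\lambda_{n-\ell+k}(T) \le \lambda_k(Q(T, U)) \le
\lambda_k(Q(T, W))$.)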

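3. Let $B = [b_{ij}]_{i,j=1}^{n} \in \mathrm{H}_n$ and denote by $A \in \mathrm{H}_{n-1}$ the matrix obtained
from $B$ by deleting the $j$-th row and column.

(a) Show the Cauchy interlacing inequalities

$\lambda_i(B) \ge \lambda_i(A) \ge \lambda_{i+1}(B)$, for $i = 1, \dots, n-1$.

(b) Show the inequality $\lambda_1(B) + \lambda_n(B) \le \lambda_1(A) + b_{jj}$.

Hint. Express the traces of $B$ and $A$ respectively in terms of
eigenvalues to obtain

$\lambda_1(B) + \lambda_n(B) = b_{jj} + \lambda_1(A) + \sum_{i=2}^{n-1} (\lambda_i(A) - \lambda_i(B))$.

Then use the Cauchy interlacing inequalities.

4. Show the following generalization of Problem 3.b ([Big96, p. 56]). Let
$B \in \mathrm{H}_n$ be the following $2 \times 2$ block matrix $B = \begin{pmatrix} B_{11} & B_{12} \\ B_{12}^* & B_{22} \end{pmatrix}$.
Show that

$\lambda_1(B) + \lambda_n(B) \le \lambda_1(B_{11}) + \lambda_1(B_{22})$.

Hint. Assume that $B\mathbf{x} = \lambda_1(B)\mathbf{x}$, $\mathbf{x}^T = (\mathbf{x}_1^T, \mathbf{x}_2^T)$, partitioned as
$B$. Consider $U = \mathrm{span}((\mathbf{x}_1^T, \mathbf{0})^T, (\mathbf{0}, \mathbf{x}_2^T)^T)$. Analyze $\lambda_1(Q(B, U)) +
\lambda_2(Q(B, U))$.

5. Let $B = (b_{ij})_1^n \in \mathrm{H}_n$. Show that $B > 0$ if and only if $\det (b_{ij})_1^k > 0$
for $k = 1, \dots, n$.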

6. Let $T \in \mathrm{S}(V)$. Denote by $\iota_+(T), \iota_0(T), \iota_-(T)$ the number of positive,
zero and negative eigenvalues among $\lambda_1(T) \ge \dots \ge \lambda_n(T)$. The
triple $\iota(T) := (\iota_+(T), \iota_0(T), \iota_-(T))$ is called the inertia of $T$. For
$B \in \mathrm{H}_n$ let $\iota(B) := (\iota_+(B), \iota_0(B), \iota_-(B))$ be the inertia of $B$, where
$\iota_+(B), \iota_0(B), \iota_-(B)$ is the number of positive, zero and negative
eigenvalues of $B$ respectively. Let $U \in \mathrm{Gr}(k, V)$. Show

(a) Assume that $\lambda_k(Q(T, U)) > 0$, i.e. $Q(T, U) > 0$. Then $k \le \iota_+(T)$.
If $k = \iota_+(T)$ then $U$ is the unique invariant subspace of $V$ spanned
by the eigenvectors of $T$ corresponding to positive eigenvalues of $T$.

(b) Assume that $\lambda_k(Q(T, U)) \ge 0$, i.e. $Q(T, U) \ge 0$. Then $k \le
\iota_+(T) + \iota_0(T)$. If $k = \iota_+(T) + \iota_0(T)$ then $U$ is the unique invariant
subspace of $V$ spanned by the eigenvectors of $T$ corresponding to
nonnegative eigenvalues of $T$.

(c) Assume that $\lambda_1(Q(T, U)) < 0$, i.e. $Q(T, U) < 0$. Then $k \le \iota_-(T)$.
If $k = \iota_-(T)$ then $U$ is the unique invariant subspace of $V$ spanned by
the eigenvectors of $T$ corresponding to negative eigenvalues of $T$.

(d) Assume that $\lambda_1(Q(T, U)) \le 0$, i.e. $Q(T, U) \le 0$. Then $k \le
\iota_-(T) + \iota_0(T)$. If $k = \iota_-(T) + \iota_0(T)$ then $U$ is the unique invariant
subspace of $V$ spanned by the eigenvectors of $T$ corresponding to
nonpositive eigenvalues of $T$.

7. Let $B \in \mathrm{H}_n$ and assume that $A = PBP^*$ for some $P \in \mathrm{GL}(n, \mathbb{C})$.
Show that $\iota(A) = \iota(B)$.

4.5 Positive definite operators and matrices 

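Definition 4.5.1 Let $V$ be a finite dimensional IPS over $\mathbb{F} = \mathbb{C}, \mathbb{R}$. Let $S, T \in \mathrm{S}(V)$. Then $T > S$ ($T \ge S$) if $\langle Tx, x\rangle > \langle Sx, x\rangle$ ($\langle Tx, x\rangle \ge \langle Sx, x\rangle$) for all $0 \ne x \in V$. $T$ is called positive (nonnegative) definite if $T > 0$ ($T \ge 0$), where $0$ is the zero operator in $\mathrm{Hom}(V)$.

Denote by $\mathrm{S}_+(V)^o \subset \mathrm{S}_+(V) \subset \mathrm{S}(V)$ the open set of positive definite self-adjoint operators and the closed set of nonnegative definite self-adjoint operators respectively.

Let $P, Q$ be either quadratic forms if $\mathbb{F} = \mathbb{R}$ or hermitian forms if $\mathbb{F} = \mathbb{C}$. Then $Q > P$ ($Q \ge P$) if $Q(x, x) > P(x, x)$ ($Q(x, x) \ge P(x, x)$) for all $0 \ne x \in V$. $Q$ is called positive (nonnegative) definite if $Q > 0$ ($Q \ge 0$), where $0$ is the zero operator in $\mathrm{Hom}(V)$.

For $A, B \in \mathrm{H}_n$, $B > A$ ($B \ge A$) if $x^*Bx > x^*Ax$ ($x^*Bx \ge x^*Ax$) for all $0 \ne x \in \mathbb{C}^n$. $B \in \mathrm{H}_n$ is called positive (nonnegative) definite if $B > 0$ ($B \ge 0$). Denote by $\mathrm{H}_{n,+}^o \subset \mathrm{H}_{n,+} \subset \mathrm{H}_n$ the open set of positive definite $n \times n$ hermitian matrices and the closed set of $n \times n$ nonnegative definite hermitian matrices respectively. Let $\mathrm{S}_+(n, \mathbb{R}) := \mathrm{S}(n, \mathbb{R}) \cap \mathrm{H}_{n,+}$, $\mathrm{S}_+(n, \mathbb{R})^o := \mathrm{S}(n, \mathbb{R}) \cap \mathrm{H}_{n,+}^o$.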

Use (4.4.1) to deduce. 

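Corollary 4.5.2 Let $V$ be an $n$-dimensional IPS. Let $T \in \mathrm{S}(V)$. Then $T > 0$ ($T \ge 0$) if and only if $\lambda_n(T) > 0$ ($\lambda_n(T) \ge 0$). Let $S \in \mathrm{S}(V)$ and assume that $T > S$ ($T \ge S$). Then $\lambda_i(T) > \lambda_i(S)$ ($\lambda_i(T) \ge \lambda_i(S)$) for $i = 1, \ldots, n$.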

Proposition 4.5.3 Let $V$ be a finite dimensional IPS. Assume that $T \in \mathrm{S}(V)$. Then $T \ge 0$ if and only if there exists $S \in \mathrm{S}(V)$ such that $T = S^2$. Furthermore $T > 0$ if and only if $S$ is invertible. For $0 \le T \in \mathrm{S}(V)$ there exists a unique $0 \le S \in \mathrm{S}(V)$ such that $T = S^2$. This $S$ is called the square root of $T$ and is denoted by $T^{\frac{1}{2}}$.

Proof. Assume first that $T \ge 0$. Let $e_1, \ldots, e_n$ be an orthonormal basis consisting of eigenvectors of $T$ as in (4.4.3). Since $\lambda_i(T) \ge 0$, $i = 1, \ldots, n$, we can define $P \in \mathrm{Hom}(V)$ as follows

$Pe_i = \sqrt{\lambda_i(T)}\, e_i$, $i = 1, \ldots, n$.

Clearly $P$ is self-adjoint nonnegative and $T = P^2$.

Suppose now that $T = S^2$ for some $S \in \mathrm{S}(V)$. Then $T \in \mathrm{S}(V)$ and $\langle Tx, x\rangle = \langle Sx, Sx\rangle \ge 0$. Hence $T \ge 0$. Clearly $\langle Tx, x\rangle = 0 \iff Sx = 0$. Hence $T > 0 \iff S \in \mathrm{GL}(V)$. Suppose that $S \ge 0$. Then $\lambda_i(S) = \sqrt{\lambda_i(T)}$, $i = 1, \ldots, n$. Furthermore each eigenvector of $S$ is an eigenvector of $T$. It is straightforward to show that $S = P$, where $P$ is defined above. Clearly $T > 0$ if and only if $\sqrt{\lambda_n(T)} > 0$, i.e. if and only if $S$ is invertible. □

Corollary 4.5.4 Let $B \in \mathrm{H}_n$ ($\mathrm{S}(n, \mathbb{R})$). Then $B \ge 0$ if and only if there exists $A \in \mathrm{H}_n$ ($\mathrm{S}(n, \mathbb{R})$) such that $B = A^2$. Furthermore $B > 0$ if and only if $A$ is invertible. For $B \ge 0$ there exists a unique $A \ge 0$ such that $B = A^2$. This $A$ is denoted by $B^{\frac{1}{2}}$.
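As a small illustration of Corollary 4.5.4, the following sketch computes the nonnegative definite square root of a real symmetric $2 \times 2$ matrix. It does not follow the eigenvector construction of the proof of Proposition 4.5.3. Instead it assumes the closed form $B^{\frac{1}{2}} = (B + sI)/t$ with $s = \sqrt{\det B}$ and $t = \sqrt{\mathrm{tr}\, B + 2s}$, which follows from the Cayley-Hamilton identity $B^2 = (\mathrm{tr}\, B) B - (\det B) I$ and is valid only in dimension 2. The function names are illustrative.

```python
import math

def psd_sqrt_2x2(b):
    """Nonnegative definite square root of a real symmetric 2x2 matrix b >= 0."""
    (p, q), (q2, r) = b
    if abs(q - q2) > 1e-12:
        raise ValueError("matrix must be symmetric")
    det = p * r - q * q
    tr = p + r
    if tr < 0 or det < -1e-12:
        raise ValueError("matrix must be nonnegative definite")
    s = math.sqrt(max(det, 0.0))   # s = sqrt(lambda_1 * lambda_2)
    t = math.sqrt(tr + 2.0 * s)    # t = sqrt(lambda_1) + sqrt(lambda_2)
    if t == 0.0:                   # b is the zero matrix
        return [[0.0, 0.0], [0.0, 0.0]]
    return [[(p + s) / t, q / t], [q / t, (r + s) / t]]

def matmul_2x2(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

B = [[2.0, 1.0], [1.0, 2.0]]   # eigenvalues 3 and 1, so B > 0
A = psd_sqrt_2x2(B)            # candidate for the square root of B
A2 = matmul_2x2(A, A)          # should reproduce B up to rounding
```

Here $A$ is symmetric with eigenvalues $\sqrt{3}$ and $1$, so it is invertible, in agreement with $B > 0$.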

Theorem 4.5.5 Let $V$ be an IPS over $\mathbb{F} = \mathbb{C}, \mathbb{R}$. Let $x_1, \ldots, x_n \in V$. Then the grammian matrix $G(x_1, \ldots, x_n) := (\langle x_i, x_j\rangle)_1^n$ is a hermitian nonnegative definite matrix. (If $\mathbb{F} = \mathbb{R}$ then $G(x_1, \ldots, x_n)$ is real symmetric nonnegative definite.) $G(x_1, \ldots, x_n) > 0$ if and only if $x_1, \ldots, x_n$ are linearly independent. Furthermore for any integer $k \in [1, n-1]$

(4.5.1) $\det G(x_1, \ldots, x_n) \le \det G(x_1, \ldots, x_k) \det G(x_{k+1}, \ldots, x_n)$.

Equality holds if and only if either $\det G(x_1, \ldots, x_k) \det G(x_{k+1}, \ldots, x_n) = 0$ or $\langle x_i, x_j\rangle = 0$ for $i = 1, \ldots, k$ and $j = k+1, \ldots, n$.

Proof. Clearly $G(x_1, \ldots, x_n) \in \mathrm{H}_n$. If $V$ is an IPS over $\mathbb{R}$ then $G(x_1, \ldots, x_n) \in \mathrm{S}(n, \mathbb{R})$. Let $a = (a_1, \ldots, a_n)^T \in \mathbb{F}^n$. Then

$a^* G(x_1, \ldots, x_n) a = \langle \sum_{i=1}^n a_i x_i, \sum_{j=1}^n a_j x_j \rangle \ge 0$.

Equality holds if and only if $\sum_{i=1}^n a_i x_i = 0$. Hence $G(x_1, \ldots, x_n) \ge 0$ and $G(x_1, \ldots, x_n) > 0$ if and only if $x_1, \ldots, x_n$ are linearly independent. In particular $\det G(x_1, \ldots, x_n) \ge 0$ and $\det G(x_1, \ldots, x_n) > 0$ if and only if $x_1, \ldots, x_n$ are linearly independent.

We now prove the inequality (4.5.1). Assume first that the right-hand side of (4.5.1) is zero. Then either $x_1, \ldots, x_k$ or $x_{k+1}, \ldots, x_n$ are linearly dependent. Hence $x_1, \ldots, x_n$ are linearly dependent and $\det G = 0$.

Assume now that the right-hand side of (4.5.1) is positive. Hence $x_1, \ldots, x_k$ and $x_{k+1}, \ldots, x_n$ are linearly independent. If $x_1, \ldots, x_n$ are linearly dependent then $\det G = 0$ and strict inequality holds in (4.5.1). It is left to show the inequality (4.5.1) and the equality case when $x_1, \ldots, x_n$ are linearly independent. Perform the Gram-Schmidt algorithm on $x_1, \ldots, x_n$ as given in (4.1.1). Let $S_j = \mathrm{span}(x_1, \ldots, x_j)$ for $j = 1, \ldots, n$. Corollary 4.1.1 yields that $\mathrm{span}(e_1, \ldots, e_{n-1}) = S_{n-1}$. Hence $y_n = x_n - \sum_{j=1}^{n-1} b_j x_j$ for some $b_1, \ldots, b_{n-1} \in \mathbb{F}$. Let $G'$ be the matrix obtained from $G(x_1, \ldots, x_n)$ by subtracting from the $n$-th row $b_j$ times the $j$-th row. Thus the last row of $G'$ is $(\langle y_n, x_1\rangle, \ldots, \langle y_n, x_n\rangle) = (0, \ldots, 0, \|y_n\|^2)$. Clearly $\det G(x_1, \ldots, x_n) = \det G'$. Expand $\det G'$ by the last row to deduce

(4.5.2) $\det G(x_1, \ldots, x_n) = \det G(x_1, \ldots, x_{n-1}) \|y_n\|^2 = \ldots = \det G(x_1, \ldots, x_k) \prod_{i=k+1}^n \|y_i\|^2 = \det G(x_1, \ldots, x_k) \prod_{i=k+1}^n \mathrm{dist}(x_i, S_{i-1})^2$, $k = n-1, \ldots, 1$.

Perform the Gram-Schmidt process on $x_{k+1}, \ldots, x_n$ to obtain the orthogonal set of vectors $\hat{y}_{k+1}, \ldots, \hat{y}_n$ such that

$\hat{S}_j := \mathrm{span}(x_{k+1}, \ldots, x_j) = \mathrm{span}(\hat{y}_{k+1}, \ldots, \hat{y}_j)$, $\mathrm{dist}(x_j, \hat{S}_{j-1}) = \|\hat{y}_j\|$,

for $j = k+1, \ldots, n$, where $\hat{S}_k = \{0\}$. Use (4.5.2) to deduce that $\det G(x_{k+1}, \ldots, x_n) = \prod_{j=k+1}^n \|\hat{y}_j\|^2$. As $\hat{S}_{j-1} \subset S_{j-1}$ for $j > k$ it follows that

$\|y_j\| = \mathrm{dist}(x_j, S_{j-1}) \le \mathrm{dist}(x_j, \hat{S}_{j-1}) = \|\hat{y}_j\|$, $j = k+1, \ldots, n$.

This shows (4.5.1). Assume now equality holds in (4.5.1). Then $\|y_j\| = \|\hat{y}_j\|$ for $j = k+1, \ldots, n$. Since $\hat{S}_{j-1} \subset S_{j-1}$ and $\hat{y}_j - x_j \in \hat{S}_{j-1} \subset S_{j-1}$ it follows that $\mathrm{dist}(x_j, S_{j-1}) = \mathrm{dist}(\hat{y}_j, S_{j-1}) = \|y_j\|$. Hence $\|\hat{y}_j\| = \mathrm{dist}(\hat{y}_j, S_{j-1})$. Part (h) of Problem 4.1.4 yields that $\hat{y}_j$ is orthogonal to $S_{j-1}$. In particular each $\hat{y}_j$ is orthogonal to $S_k$ for $j = k+1, \ldots, n$. Hence $x_j \perp S_k$ for $j = k+1, \ldots, n$, i.e. $\langle x_j, x_i\rangle = 0$ for $j > k$ and $i \le k$. Clearly, if the last condition holds then

$\det G(x_1, \ldots, x_n) = \det G(x_1, \ldots, x_k) \det G(x_{k+1}, \ldots, x_n)$. □

$\det G(x_1, \ldots, x_n)$ has the following geometric meaning. Consider a parallelepiped $\Pi$ in $V$ spanned by $x_1, \ldots, x_n$ starting from the origin $0$. That is, $\Pi$ is the convex hull spanned by the vectors $0$ and $\sum_{i \in S} x_i$ for all nonempty subsets $S \subset \{1, \ldots, n\}$. Then $\sqrt{\det G(x_1, \ldots, x_n)}$ is the $n$-volume of $\Pi$. The inequality (4.5.1) and equalities (4.5.2) are "obvious" from this geometrical point of view.
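The inequality (4.5.1) can be checked on a concrete example. The sketch below is only an illustration under stated assumptions: the vectors lie in $\mathbb{R}^3$ with the standard inner product, the entries are rational so that the determinants are exact, and the helper names `det` and `gram` are not taken from the text.

```python
from fractions import Fraction

def det(m):
    """Determinant by Gaussian elimination over the rationals (exact)."""
    a = [[Fraction(v) for v in row] for row in m]
    n = len(a)
    result = Fraction(1)
    for c in range(n):
        # Find a row at or below the diagonal with a nonzero entry in column c.
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result          # a row swap changes the sign
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return result

def gram(vectors):
    """Gram matrix (<x_i, x_j>) for real vectors with the standard inner product."""
    return [[sum(p * q for p, q in zip(x, y)) for y in vectors] for x in vectors]

xs = [(1, 0, 0), (1, 1, 0), (1, 1, 1)]
full = det(gram(xs))
# Pairs (k, det G(x_1..x_k) * det G(x_{k+1}..x_n)) for k = 1, ..., n - 1.
bounds = [(k, det(gram(xs[:k])) * det(gram(xs[k:]))) for k in range(1, len(xs))]
holds = all(full <= bound for _, bound in bounds)
```

For these vectors the inequality is strict for both values of $k$, since no $x_j$ with $j > k$ is orthogonal to all of $x_1, \ldots, x_k$.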

Corollary 4.5.6 Let $0 \le B = (b_{ij})_1^n \in \mathrm{H}_{n,+}$. Then

$\det B \le \det (b_{ij})_1^k \det (b_{ij})_{k+1}^n$, for $k = 1, \ldots, n-1$.

For a fixed $k$ equality holds if and only if either the right-hand side of the above inequality is zero or $b_{ij} = 0$ for $i = 1, \ldots, k$ and $j = k+1, \ldots, n$.

Proof. From Corollary 4.5.4 it follows that $B = X^2$ for some $X \in \mathrm{H}_n$. Let $x_1, \ldots, x_n \in \mathbb{C}^n$ be the $n$ columns of $X^T = (x_1, \ldots, x_n)$. Let $\langle x, y\rangle = y^*x$. Since $X \in \mathrm{H}_n$ we deduce that $B = G(x_1, \ldots, x_n)$. □



Theorem 4.5.7 Let $V$ be an $n$-dimensional IPS. Let $T \in \mathrm{S}(V)$. TFAE:

(a) $T > 0$.

(b) Let $g_1, \ldots, g_n$ be a basis of $V$. Then $\det (\langle Tg_i, g_j\rangle)_{i,j=1}^k > 0$, $k = 1, \ldots, n$.

Proof. (a) $\Rightarrow$ (b). According to Proposition 4.5.3 $T = S^2$ for some $S \in \mathrm{S}(V) \cap \mathrm{GL}(V)$. Then $\langle Tg_i, g_j\rangle = \langle Sg_i, Sg_j\rangle$. Hence $\det (\langle Tg_i, g_j\rangle)_{i,j=1}^k = \det G(Sg_1, \ldots, Sg_k)$. Since $S$ is invertible and $g_1, \ldots, g_k$ are linearly independent it follows that $Sg_1, \ldots, Sg_k$ are linearly independent. Theorem 4.5.5 implies that $\det G(Sg_1, \ldots, Sg_k) > 0$ for $k = 1, \ldots, n$.

(b) $\Rightarrow$ (a). The proof is by induction on $n$. For $n = 1$ (a) is obvious. Assume that (a) holds for $n = m - 1$. Let $U := \mathrm{span}(g_1, \ldots, g_{n-1})$ and $Q := Q(T, U)$. Then there exists $P \in \mathrm{S}(U)$ such that $\langle Px, y\rangle = Q(x, y) = \langle Tx, y\rangle$ for any $x, y \in U$. By induction $P > 0$. Corollary 4.4.3 yields that $\lambda_{n-1}(T) \ge \lambda_{n-1}(P) > 0$. Hence $T$ has at least $n - 1$ positive eigenvalues. Let $e_1, \ldots, e_n$ be given by (4.4.3). Then $\det (\langle Te_i, e_j\rangle)_{i,j=1}^n = \prod_{i=1}^n \lambda_i(T)$. Let $A = (a_{pq})_1^n \in \mathrm{GL}(n, \mathbb{C})$ be the transformation matrix from the basis $g_1, \ldots, g_n$ to $e_1, \ldots, e_n$, i.e.

$g_i = \sum_{p=1}^n a_{pi} e_p$, $i = 1, \ldots, n$.

It is straightforward to show that

(4.5.3) $(\langle Tg_i, g_j\rangle)_1^n = A^T (\langle Te_p, e_q\rangle)_1^n \bar{A} \Rightarrow \det (\langle Tg_i, g_j\rangle)_1^n = \det (\langle Te_i, e_j\rangle)_1^n\, |\det A|^2 = |\det A|^2 \prod_{i=1}^n \lambda_i(T)$.

Since $\det (\langle Tg_i, g_j\rangle)_1^n > 0$ and $\lambda_1(T) \ge \ldots \ge \lambda_{n-1}(T) > 0$ it follows that $\lambda_n(T) > 0$. □

Corollary 4.5.8 Let $B = (b_{ij})_1^n \in \mathrm{H}_n$. Then $B > 0$ if and only if $\det (b_{ij})_{i,j=1}^k > 0$ for $k = 1, \ldots, n$.

The following result is straightforward (see Problem 1):

Proposition 4.5.9 Let $V$ be a finite dimensional IPS over $\mathbb{F} = \mathbb{R}, \mathbb{C}$ with the inner product $\langle \cdot, \cdot\rangle$. Assume that $T \in \mathrm{S}(V)$. Then $T > 0$ if and only if $(x, y) := \langle Tx, y\rangle$ is an inner product on $V$. Vice versa, any inner product $(\cdot, \cdot) : V \times V \to \mathbb{F}$ is of the form $(x, y) = \langle Tx, y\rangle$ for a unique self-adjoint positive definite operator $T \in \mathrm{Hom}(V)$.

Example 4.5.10 Each $0 < B \in \mathrm{H}_n$ induces an inner product on $\mathbb{C}^n$: $(x, y) = y^*Bx$. Each $0 < B \in \mathrm{S}(n, \mathbb{R})$ induces an inner product on $\mathbb{R}^n$: $(x, y) = y^T Bx$. Furthermore any inner product on $\mathbb{C}^n$ or $\mathbb{R}^n$ is of the above form. In particular, the standard inner products on $\mathbb{C}^n$ and $\mathbb{R}^n$ are induced by the identity matrix $I$.

Definition 4.5.11 Let $V$ be a finite dimensional IPS with the inner product $\langle \cdot, \cdot\rangle$. Let $S \in \mathrm{Hom}(V)$. Then $S$ is called symmetrizable if there exists an inner product $(\cdot, \cdot)$ on $V$ such that $S$ is self-adjoint with respect to $(\cdot, \cdot)$.
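The criterion of Corollary 4.5.8 is easy to try out on small matrices. The following is a minimal sketch, assuming a real symmetric matrix with rational entries so that all leading principal minors are computed exactly. The names `det`, `leading_minors` and `is_positive_definite` are illustrative and not part of the text.

```python
from fractions import Fraction

def det(m):
    """Determinant by Gaussian elimination over the rationals (exact)."""
    a = [[Fraction(v) for v in row] for row in m]
    n = len(a)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result          # a row swap changes the sign
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return result

def leading_minors(b):
    """The determinants det (b_ij)_{i,j=1}^k for k = 1, ..., n."""
    return [det([row[:k] for row in b[:k]]) for k in range(1, len(b) + 1)]

def is_positive_definite(b):
    """Test B > 0 for a symmetric matrix via the leading principal minors."""
    return all(d > 0 for d in leading_minors(b))

B1 = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]   # leading minors 2, 3, 4
B2 = [[1, 2], [2, 1]]                        # leading minors 1, -3
```

Note that the test is stated for positive definiteness only. Nonnegativity of the leading principal minors alone does not characterize $B \ge 0$.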



Problems

1. Show Proposition 4.5.9.

2. Recall the Hölder inequality

(4.5.4) $\sum_{l=1}^n x_l y_l a_l \le \left(\sum_{l=1}^n x_l^p a_l\right)^{\frac{1}{p}} \left(\sum_{l=1}^n y_l^q a_l\right)^{\frac{1}{q}}$

for any $x = (x_1, \ldots, x_n)^T$, $y = (y_1, \ldots, y_n)^T$, $a = (a_1, \ldots, a_n)^T \in \mathbb{R}_+^n$ and $p, q \in (1, \infty)$ such that $\frac{1}{p} + \frac{1}{q} = 1$. Show

(a) Let $A \in \mathrm{H}_{n,+}$, $x \in \mathbb{C}^n$ and $0 \le i < j < k$ be three integers. Then

(4.5.5) $x^* A^j x \le (x^* A^i x)^{\frac{k-j}{k-i}} (x^* A^k x)^{\frac{j-i}{k-i}}$.

Hint: Diagonalize $A$.

(b) Assume that $A = e^B$ for some $B \in \mathrm{H}_n$. Show that (4.5.5) holds for any three real numbers $i < j < k$.

4.6 Majorization and applications 

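Definition 4.6.1 Let

$\mathbb{R}^n_{\searrow} := \{x = (x_1, \ldots, x_n)^T \in \mathbb{R}^n : x_1 \ge x_2 \ge \ldots \ge x_n\}$.

For $x = (x_1, \ldots, x_n)^T \in \mathbb{R}^n$ let $\bar{x} = (\bar{x}_1, \ldots, \bar{x}_n)^T \in \mathbb{R}^n_{\searrow}$ be the unique rearrangement of the coordinates of $x$ in a decreasing order. That is, there exists a permutation $\pi$ on $\{1, \ldots, n\}$ such that $\bar{x}_i = x_{\pi(i)}$, $i = 1, \ldots, n$.

Let $x = (x_1, \ldots, x_n)^T, y = (y_1, \ldots, y_n)^T \in \mathbb{R}^n$. Then $x$ is weakly majorized by $y$ ($y$ weakly majorizes $x$), which is denoted by $x \preceq y$, if

(4.6.1) $\sum_{i=1}^k \bar{x}_i \le \sum_{i=1}^k \bar{y}_i$, $k = 1, \ldots, n$.

$x$ is majorized by $y$ ($y$ majorizes $x$), which is denoted by $x \prec y$, if $x \preceq y$ and $\sum_{i=1}^n x_i = \sum_{i=1}^n y_i$.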
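Definition 4.6.1 translates directly into a short test. The sketch below is an illustration only: the function names are not from the text, and the comparisons are exact, so it is intended for integer or rational coordinates (floating point data would need a tolerance).

```python
def partial_sums_desc(x):
    """Partial sums of the coordinates of x rearranged in decreasing order."""
    total, sums = 0, []
    for v in sorted(x, reverse=True):
        total += v
        sums.append(total)
    return sums

def weakly_majorized(x, y):
    """True if x is weakly majorized by y, i.e. (4.6.1) holds for k = 1, ..., n."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return all(s <= t for s, t in zip(partial_sums_desc(x), partial_sums_desc(y)))

def majorized(x, y):
    """True if x is majorized by y: weak majorization plus equal total sums."""
    return weakly_majorized(x, y) and sum(x) == sum(y)
```

For example, `majorized([2, 2, 2], [1, 3, 2])` holds, while `[1, 1, 1]` is only weakly majorized by `[3, 2, 1]` because the total sums differ.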

Definition 4.6.2 $A \in \mathbb{R}_+^{n \times n}$ is called a doubly stochastic matrix if the sum of each row and column of $A$ is equal to 1. Denote by $\Omega_n \subset \mathbb{R}_+^{n \times n}$ the set of doubly stochastic matrices. Denote by $\frac{1}{n} J_n$ the $n \times n$ doubly stochastic matrix all of whose entries are equal to $\frac{1}{n}$, i.e. $J_n \in \mathbb{R}_+^{n \times n}$ is the matrix whose each entry is 1.

Definition 4.6.3 $P \in \mathbb{R}_+^{n \times n}$ is called a permutation matrix if each row and column of $P$ contains exactly one nonzero element, which is equal to 1. Denote by $\mathcal{P}_n$ the set of $n \times n$ permutation matrices.

Lemma 4.6.4 The following properties hold.

1. $A \in \mathbb{R}_+^{n \times n}$ is doubly stochastic if and only if $A\mathbf{1} = A^T\mathbf{1} = \mathbf{1}$, where $\mathbf{1} = (1, \ldots, 1)^T \in \mathbb{R}^n$.

2. $\Omega_1 = \{1\}$.

3. $A, B \in \Omega_n \Rightarrow tA + (1-t)B \in \Omega_n$ for each $t \in [0, 1]$.

4. $A, B \in \Omega_n \Rightarrow AB \in \Omega_n$.

5. $\mathcal{P}_n \subset \Omega_n$.

6. $\mathcal{P}_n$ is a group with respect to the multiplication of matrices, with $I_n$ the identity and $P^{-1} = P^T$.

7. $A \in \Omega_l$, $B \in \Omega_m \Rightarrow A \oplus B \in \Omega_{l+m}$.

See Problem 1.

Theorem 4.6.5 $A \in \mathbb{R}_+^{n \times n}$ is doubly stochastic if and only if

(4.6.2) $A = \sum_{P \in \mathcal{P}_n} a_P P$ for some $a_P \ge 0$, $P \in \mathcal{P}_n$, $\sum_{P \in \mathcal{P}_n} a_P = 1$.

Proof. In view of properties 3 and 5 of Lemma 4.6.4 it follows that any $A$ of the form (4.6.2) is doubly stochastic. We now show by induction on $n$ that any $A \in \Omega_n$ is of the form (4.6.2). For $n = 1$ the result trivially holds. Assume that the result holds for $n = m - 1$ and assume that $n = m$.

Assume that $A = (a_{ij}) \in \Omega_n$. Let $l(A)$ be the number of nonzero entries of $A$. Since each row sum of $A$ is 1 it follows that $l(A) \ge n$. Suppose first $l(A) \le 2n - 1$. Then there exists a row $i$ of $A$ which has exactly one nonzero element, which must be 1. Hence there exist $i, j \in \langle n\rangle$ such that $a_{ij} = 1$. Then all other elements of $A$ on the row $i$ and column $j$ are zero. Denote by $A_{ij} \in \mathbb{R}_+^{(n-1) \times (n-1)}$ the matrix obtained from $A$ by deleting the row $i$ and column $j$. Clearly $A_{ij} \in \Omega_{n-1}$. Use the induction hypothesis on $A_{ij}$ to deduce (4.6.2), where $a_P = 0$ if the entry $(i, j)$ of $P$ is not 1.

We now show by induction on $l(A) \ge 2n - 1$ that $A$ is of the form (4.6.2). Suppose that any $A \in \Omega_n$ such that $l(A) \le l - 1$, $l \ge 2n$, is of the form (4.6.2). Assume that $l(A) = l$. Let $S \subset \langle n\rangle \times \langle n\rangle$ be the set of all indices $(i, j) \in \langle n\rangle \times \langle n\rangle$ where $a_{ij} > 0$. Note $\#S = l(A) \ge 2n$. Consider the following system of equations in $n^2$ variables, which are the entries of $X = (x_{ij})_{i,j=1}^n \in \mathbb{R}^{n \times n}$:

$\sum_{j=1}^n x_{ij} = \sum_{j=1}^n x_{ji} = 0$, $i = 1, \ldots, n$.

Since the sum of all rows of $X$ is equal to the sum of all columns of $X$ we deduce that the above system has at most $2n - 1$ linearly independent equations. Assume furthermore the conditions $x_{ij} = 0$ for $(i, j) \notin S$. Since we have at least $2n$ variables it follows that there exists $X \ne 0_{n \times n}$ satisfying the above conditions. Note that $X$ has a zero entry in the places where $A$ has a zero entry. Furthermore, $X$ has at least one positive and one negative entry. Therefore there exist $b, c > 0$ such that $A - bX, A + cX \in \Omega_n$ and $l(A - bX), l(A + cX) < l$. So $A - bX$, $A + cX$ are of the form (4.6.2). As $A = \frac{c}{b+c}(A - bX) + \frac{b}{b+c}(A + cX)$ we deduce that $A$ is of the form (4.6.2). □
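A decomposition of the form (4.6.2) can also be computed explicitly for small matrices. The sketch below does not follow the perturbation argument of the proof above. It uses a greedy variant instead: repeatedly pick a permutation supported on the positive entries of the remaining matrix (one exists as long as the remainder is nonzero, because the remainder is a nonnegative multiple of a doubly stochastic matrix) and subtract the largest possible multiple of it. The permutation is found by brute force over all $n!$ permutations, so this is suitable only for small $n$. Exact rational arithmetic and the function name are assumptions made for the illustration.

```python
from fractions import Fraction
from itertools import permutations

def birkhoff_decomposition(a):
    """Write a doubly stochastic matrix as a list of (weight, permutation) pairs.

    A permutation sigma stands for the matrix with ones at positions (i, sigma[i]).
    """
    n = len(a)
    rest = [[Fraction(v) for v in row] for row in a]
    if any(v < 0 for row in rest for v in row):
        raise ValueError("entries must be nonnegative")
    if any(sum(row) != 1 for row in rest) or any(sum(col) != 1 for col in zip(*rest)):
        raise ValueError("row and column sums must equal 1")
    terms = []
    while any(v != 0 for row in rest for v in row):
        # Brute-force search for a permutation supported on the positive entries.
        sigma = next(s for s in permutations(range(n))
                     if all(rest[i][s[i]] > 0 for i in range(n)))
        weight = min(rest[i][sigma[i]] for i in range(n))
        for i in range(n):
            rest[i][sigma[i]] -= weight   # zeroes at least one more entry
        terms.append((weight, sigma))
    return terms

half = Fraction(1, 2)
A = [[half, half, 0], [half, 0, half], [0, half, half]]
terms = birkhoff_decomposition(A)
# Reassemble the weighted sum of permutation matrices to compare with A.
rebuilt = [[sum(w for w, s in terms if s[i] == j) for j in range(3)]
           for i in range(3)]
```

Each step makes at least one more entry zero, so the loop ends after at most $n^2$ steps and the weights add up to 1.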

Theorem 4.6.6 Let $x, y \in \mathbb{R}^n$. Then $x \prec y$ if and only if there exists $A \in \Omega_n$ such that $x = Ay$.

Proof. Assume first that $x = Py$ for some $P \in \mathcal{P}_n$. Then it is straightforward to see that $x \prec y$. Assume that $x = Ay$ for some $A \in \Omega_n$. Use Theorem 4.6.5 to deduce that $x \prec y$.

Assume now that $x, y \in \mathbb{R}^n$ and $x \prec y$. Since $\bar{x} = Px$, $\bar{y} = Qy$ for some $P, Q \in \mathcal{P}_n$, it follows that $\bar{x} \prec \bar{y}$. In view of Lemma 4.6.4 it is enough to show that $\bar{x} = B\bar{y}$ for some $B \in \Omega_n$. We prove this claim by induction on $n$. For $n = 1$ this claim is trivial. Assume that if $\bar{x} \prec \bar{y} \in \mathbb{R}^l$ then $\bar{x} = B\bar{y}$ for some $B \in \Omega_l$ for all $l \le m - 1$. Assume that $n = m$ and $\bar{x} \prec \bar{y}$. Suppose first that for some $1 \le k \le n - 1$ we have the equality $\sum_{i=1}^k \bar{x}_i = \sum_{i=1}^k \bar{y}_i$. Let

$x_1 = (\bar{x}_1, \ldots, \bar{x}_k)^T$, $y_1 = (\bar{y}_1, \ldots, \bar{y}_k)^T \in \mathbb{R}^k_{\searrow}$,
$x_2 = (\bar{x}_{k+1}, \ldots, \bar{x}_n)^T$, $y_2 = (\bar{y}_{k+1}, \ldots, \bar{y}_n)^T \in \mathbb{R}^{n-k}_{\searrow}$.

Then $x_1 \prec y_1$, $x_2 \prec y_2$. Use the induction hypothesis that $x_i = B_i y_i$, $i = 1, 2$, where $B_1 \in \Omega_k$, $B_2 \in \Omega_{n-k}$. Hence $\bar{x} = (B_1 \oplus B_2)\bar{y}$ and $B_1 \oplus B_2 \in \Omega_n$.

It is left to consider the case where strict inequalities hold in (4.6.1) for $k = 1, \ldots, n - 1$. We now define a finite number of vectors

$\bar{y} = z_1 \succ z_2 = \bar{z}_2 \succ \ldots \succ z_N = \bar{z}_N \succ \bar{x}$,

where $N \ge 2$, such that

1. $z_{i+1} = B_i z_i$ for $i = 1, \ldots, N - 1$.

2. $\sum_{i=1}^k \bar{x}_i = \sum_{i=1}^k w_i$ for some $k \in \langle n - 1\rangle$, where $z_N = w = (w_1, \ldots, w_n)^T$.

Observe first that we cannot have $\bar{y}_1 = \ldots = \bar{y}_n$. Otherwise $\bar{x} = \bar{y}$ and we have equalities in (4.6.1) for all $k \in \langle n\rangle$, which contradicts our assumptions. Assume that we defined

$\bar{y} = z_1 \succ z_2 = \bar{z}_2 \succ \ldots \succ z_r = \bar{z}_r = (u_1, \ldots, u_n)^T \succ \bar{x}$,

for $1 \le r$ such that $\sum_{i=1}^k \bar{x}_i < \sum_{i=1}^k u_i$ for $k = 1, \ldots, n - 1$. Assume that $u_1 = \ldots = u_p > u_{p+1} = \ldots = u_{p+q}$, where $u_{p+q} > u_{p+q+1}$ if $p + q < n$. Let $C(t) = ((1 - t)I_{p+q} + \frac{t}{p+q}J_{p+q}) \oplus I_{n-(p+q)}$ for $t \in [0, 1]$ and define $u(t) = C(t)z_r$. We vary $t$ continuously from $t = 0$ to $t = 1$. Note that $\overline{u(t)} = u(t)$ for all $t \in [0, 1]$. We have two possibilities. First, there exists $t_0 \in (0, 1]$ such that $u(t) \succ \bar{x}$ for all $t \in [0, t_0]$. Furthermore for $w = u(t_0) = (w_1, \ldots, w_n)^T$ we have the equality $\sum_{i=1}^k \bar{x}_i = \sum_{i=1}^k w_i$ for some $k \in \langle n - 1\rangle$. In that case $r = N - 1$ and $z_N = u(t_0)$.

Otherwise let $z_{r+1} = u(1) = (v_1, \ldots, v_n)^T$, where $v_1 = \ldots = v_{p+q} > v_{p+q+1}$. Repeat this process for $z_{r+1}$ and so on until we deduce the conditions 1 and 2. So $\bar{x} = B_N z_N = B_N B_{N-1} z_{N-1} = B_N \ldots B_1 \bar{y}$. In view of 4 of Lemma 4.6.4 we deduce that $\bar{x} = A\bar{y}$ for some $A \in \Omega_n$. □

Combine Theorems 4.6.6 and 4.6.5 to deduce. 

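Corollary 4.6.7 Let $x, y \in \mathbb{R}^n$. Then $x \prec y$ if and only if

(4.6.3) $x = \sum_{P \in \mathcal{P}_n} a_P P y$ for some $a_P \ge 0$, $P \in \mathcal{P}_n$, where $\sum_{P \in \mathcal{P}_n} a_P = 1$.

Furthermore, if $x \prec y$ and $x \ne Py$ for all $P \in \mathcal{P}_n$ then in (4.6.3) each $a_P < 1$.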

Definition 4.6.8 Let $I \subset \mathbb{R}$ be an interval. A function $\phi : I \to \mathbb{R}$ is called convex if for any $x, y \in I$ and $t \in [0, 1]$, $\phi(tx + (1 - t)y) \le t\phi(x) + (1 - t)\phi(y)$. $\phi$ is called strictly convex on $I$ if for any $x, y \in I$, $x \ne y$ and $t \in (0, 1)$, $\phi(tx + (1 - t)y) < t\phi(x) + (1 - t)\phi(y)$.

A function $\psi : I \to \mathbb{R}$ is called concave or strictly concave if the function $-\psi$ is convex or strictly convex respectively.

Theorem 4.6.9 Let $x = (x_1, \ldots, x_n)^T \prec y = (y_1, \ldots, y_n)^T$. Let $\phi : [\bar{y}_n, \bar{y}_1] \to \mathbb{R}$ be a convex function. Then

(4.6.4) $\sum_{i=1}^n \phi(x_i) \le \sum_{i=1}^n \phi(y_i)$.

If $\phi$ is strictly convex on $[\bar{y}_n, \bar{y}_1]$ and $Px \ne y$ for all $P \in \mathcal{P}_n$ then strict inequality holds in (4.6.4).

Proof. Problem 3 implies that if $x = (x_1, \ldots, x_n)^T \prec y = (y_1, \ldots, y_n)^T$ then $x_i \in [\bar{y}_n, \bar{y}_1]$ for $i = 1, \ldots, n$. Use Corollary 4.6.7 and the convexity of $\phi$, see Problem 2, to deduce:

$\phi(x_i) \le \sum_{P \in \mathcal{P}_n} a_P \phi((Py)_i)$, $i = 1, \ldots, n$.

Observe next that $\sum_{i=1}^n \phi(y_i) = \sum_{i=1}^n \phi((Py)_i)$ for all $P \in \mathcal{P}_n$. Sum up the above inequalities to deduce (4.6.4).

Assume now that $\phi$ is strictly convex and $x \ne Py$ for all $P \in \mathcal{P}_n$. Then Corollary 4.6.7 and the strict convexity of $\phi$ imply that at least one of the above inequalities is strict. Hence strict inequality holds in (4.6.4). □
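Theorems 4.6.6 and 4.6.9 can be seen together on a small example. In the sketch below the doubly stochastic matrix, the vector $y$ and the two convex functions are arbitrary illustrative choices, and rational arithmetic is used so that all comparisons are exact.

```python
from fractions import Fraction

half = Fraction(1, 2)
A = [[half, half, 0], [half, 0, half], [0, half, half]]   # doubly stochastic
y = [4, 2, 0]
x = [sum(a * v for a, v in zip(row, y)) for row in A]      # x = A y

def partial_sums_desc(v):
    """Partial sums of the coordinates of v rearranged in decreasing order."""
    total, sums = 0, []
    for c in sorted(v, reverse=True):
        total += c
        sums.append(total)
    return sums

# Theorem 4.6.6: x = A y with A doubly stochastic implies that x is majorized by y.
is_majorized = (all(s <= t for s, t in zip(partial_sums_desc(x), partial_sums_desc(y)))
                and sum(x) == sum(y))

# Theorem 4.6.9: for convex phi the sum over y dominates the sum over x.
convex_functions = {"square": lambda t: t * t, "abs_shift": lambda t: abs(t - 2)}
gaps = {name: sum(f(v) for v in y) - sum(f(v) for v in x)
        for name, f in convex_functions.items()}
```

Both gaps are positive here. For the strictly convex function $t \mapsto t^2$ this is forced by the theorem, since $x$ is not a permutation of $y$.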



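Corollary 4.6.10 Let $V$ be an $n$-dimensional IPS. Let $T \in \mathrm{S}(V)$. Denote $\lambda(T) := (\lambda_1(T), \ldots, \lambda_n(T))^T \in \mathbb{R}^n_{\searrow}$. Let $F_n = \{f_1, \ldots, f_n\} \in \mathrm{Fr}(n, V)$. Then

$(\langle Tf_1, f_1\rangle, \ldots, \langle Tf_n, f_n\rangle)^T \prec \lambda(T)$.

Let $\phi : [\lambda_n(T), \lambda_1(T)] \to \mathbb{R}$ be a convex function. Then

$\sum_{i=1}^n \phi(\lambda_i(T)) = \max_{\{f_1, \ldots, f_n\} \in \mathrm{Fr}(n, V)} \sum_{i=1}^n \phi(\langle Tf_i, f_i\rangle)$.

If $\phi$ is strictly convex then $\sum_{i=1}^n \phi(\lambda_i(T)) = \sum_{i=1}^n \phi(\langle Tf_i, f_i\rangle)$ if and only if $f_1, \ldots, f_n$ is a set of $n$ orthonormal eigenvectors of $T$.

See Problem 4.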

Problems 

1. Prove Lemma 4.6.4. 

2. Let $I \subset \mathbb{R}$ be an interval and assume that $\phi : I \to \mathbb{R}$ is convex. Let $x_1, \ldots, x_m \in I$, $m \ge 3$. Show

(a) Let $a_1, \ldots, a_m \in [0, 1]$ and assume that $\sum_{i=1}^m a_i = 1$. Then

(4.6.5) $\phi(\sum_{i=1}^m a_i x_i) \le \sum_{i=1}^m a_i \phi(x_i)$.

(b) Assume in addition that $\phi$ is strictly convex, $x_i \ne x_j$ for $i \ne j$ and $a_1, \ldots, a_m > 0$. Then strict inequality holds in (4.6.5).

3. Let $x, y \in \mathbb{R}^n$. Show that $x \prec y \iff -x \prec -y$.

4. Prove Corollary 4.6.10.

4.7 Spectral functions

Let $\mathcal{T} \subset \mathrm{S}(V)$, where $V$ is an $n$-dimensional IPS over $\mathbb{F} = \mathbb{R}, \mathbb{C}$. Denote $\lambda(\mathcal{T}) := \{\lambda(T) \in \mathbb{R}^n_{\searrow} : T \in \mathcal{T}\}$. A function $f : \mathcal{T} \to \mathbb{R}$ is called a spectral function if there exists a set $D \subset \mathbb{R}^n_{\searrow}$ and $h : D \to \mathbb{R}$ such that $\lambda(\mathcal{T}) \subset D$ and $f(T) = h(\lambda(T))$ for each $T \in \mathcal{T}$. $D$ is called a Schur set if

$x, y \in \mathbb{R}^n_{\searrow}$, $x \prec y$, $y \in D \Rightarrow x \in D$.

Let $D \subset \mathbb{R}^n_{\searrow}$ be a Schur set. A function $h : D \to \mathbb{R}$ is called Schur's order preserving if

$h(x) \le h(y)$ for any $x, y \in D$ such that $x \prec y$.

$h$ is called strict Schur's order preserving if a strict inequality holds in the above inequality whenever $x \ne y$. $h$ is called strong Schur's order preserving if

$h(x) \le h(y)$ for any $x, y \in D$ such that $x \preceq y$.

$h$ is called strict strong Schur's order preserving if a strict inequality holds in the above inequality whenever $x \ne y$.

Note that if $h((x_1, \ldots, x_n)) := \sum_{i=1}^n g(x_i)$ for some convex function $g : \mathbb{R} \to \mathbb{R}$ then Corollary 4.6.10 implies that $h : \mathbb{R}^n_{\searrow} \to \mathbb{R}$ is Schur's order preserving. The results of Section 4.4 yield:

Proposition 4.7.1 Let $V$ be an $n$-dimensional IPS, $D \subset \mathbb{R}^n_{\searrow}$ be a Schur set and $h : D \to \mathbb{R}$ be a Schur's order preserving function. Let $T \in \mathrm{S}(V)$ and assume that $\lambda(T) \in D$. Then

$h(\lambda(T)) = \max_{x \in D,\, x \prec \lambda(T)} h(x)$.

Definition 4.7.2 A set $D \subset \mathbb{R}^n$ is called a regular set if the interior of $D$, denoted by $D^o \subset \mathbb{R}^n$, is a nonempty set, and $D$ is a subset of the closure of $D^o$, denoted by $\mathrm{Cl}(D^o)$. For a regular set $D$ a function $F : D \to \mathbb{R}$ is in the class $C^k(D)$, i.e. $F$ has $k$ continuous derivatives, if $F \in C^k(D^o)$ and $F$ and any of its derivatives of order not greater than $k$ has a continuous extension to $D$.

Definition 4.7.3 Let $V$ be a vector space over $\mathbb{R}$. For $x, y \in V$ denote

$[x, y] := \{z : z = ax + (1 - a)y$ for some $a \in [0, 1]\}$.

A set $C \subset V$ is called convex if for each $x, y \in C$, $[x, y] \subset C$. Assume that $C \subset V$ is a nonempty convex set and let $x \in C$. Denote by $C - x$ the set $\{z : z = y - x, y \in C\}$. Let $U = \mathrm{span}(C - x)$. Then the dimension of $C$, denoted by $\dim C$, is the dimension of the vector space $U$. $C - x$ has an interior as a subset of $U$, called the relative interior and denoted by $\mathrm{ri}(C - x)$. Then the relative interior of $C$ is defined as $\mathrm{ri}(C - x) + x$.

It is straightforward to show that $\dim C$, $\mathrm{ri}\, C$ do not depend on the choice of $x \in C$. Furthermore $\mathrm{ri}\, C$ is convex. See Problem 7 or [Roc70].

Proposition 4.7.4 Let $y = (y_1, \ldots, y_n)^T \in \mathbb{R}^n_{\searrow}$. Denote

$M(y) := \{x \in \mathbb{R}^n_{\searrow} : x \prec y\}$.

Then $M(y)$ is a closed convex set.

See Problem 1.

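Theorem 4.7.5 Let $D \subset \mathbb{R}^n_{\searrow}$ be a regular Schur set in $\mathbb{R}^n$. Let $F \in C^1(D)$. Then $F$ is Schur's order preserving if and only if

(4.7.1) $\frac{\partial F}{\partial x_1}(x) \ge \ldots \ge \frac{\partial F}{\partial x_n}(x)$, for each $x = (x_1, \ldots, x_n)^T \in D$.

If for any point $x = (x_1, \ldots, x_n)^T \in D$ such that $x_i > x_{i+1}$ the inequality $\frac{\partial F}{\partial x_i}(x) > \frac{\partial F}{\partial x_{i+1}}(x)$ holds then $F$ is strict Schur's order preserving.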

Proof. Assume that $F \in C^1(D)$ and $F$ is Schur's order preserving. Let $x = (x_1, \ldots, x_n)^T \in D^o$. Hence $x_1 > \ldots > x_n$. Let $e_i = (\delta_{i1}, \ldots, \delta_{in})^T$, $i = 1, \ldots, n$. For $i \in [1, n-1] \cap \mathbb{Z}_+$ let $x(t) := x + t(e_i - e_{i+1})$. Then

(4.7.2) $x(t) \in \mathbb{R}^n_{\searrow}$ for $|t| \le \tau := \min_{1 \le j \le n-1} \frac{x_j - x_{j+1}}{2}$, and $x(t_1) \prec x(t_2)$ for $-\tau \le t_1 \le t_2 \le \tau$.

See Problem 2. Since $D^o$ is open there exists $\epsilon > 0$ such that $x(t) \in D^o$ for $t \in [-\epsilon, \epsilon]$. Then $f(t) := F(x(t))$ is an increasing function on $[-\epsilon, \epsilon]$. Hence $f'(0) = \frac{\partial F}{\partial x_i}(x) - \frac{\partial F}{\partial x_{i+1}}(x) \ge 0$. This proves (4.7.1) in $D^o$. The continuity argument yields (4.7.1) in $D$.

Assume now that (4.7.1) holds. Let $y = (y_1, \ldots, y_n)^T, z = (z_1, \ldots, z_n)^T \in D$ and define

$y(t) := (1 - t)y + tz$, $g(t) := F((1 - t)y + tz)$, for $t \in [0, 1]$.

Suppose that $y \prec z$. Then $y(t_1) \prec y(t_2)$ for $0 \le t_1 \le t_2 \le 1$. Since $D$ is a Schur set $[y, z] \subset D$. Then

(4.7.3) $g'(t) = \sum_{i=1}^{n-1} \left(\frac{\partial F(y(t))}{\partial x_i} - \frac{\partial F(y(t))}{\partial x_{i+1}}\right) \sum_{j=1}^{i} (z_j - y_j)$.

See Problem 3. Hence $g'(t) \ge 0$, i.e. $g(t)$ is a nondecreasing function on $[0, 1]$. Thus $F(y) = g(0) \le g(1) = F(z)$. Assume that for any point $x = (x_1, \ldots, x_n)^T \in D$ such that $x_i > x_{i+1}$ the inequality $\frac{\partial F}{\partial x_i}(x) > \frac{\partial F}{\partial x_{i+1}}(x)$ holds. Suppose that $y \ne z$. Then $g'(t) > 0$ and $F(y) = g(0) < g(1) = F(z)$. □

Theorem 4.7.6 Let $D \subset \mathbb{R}^n_{\searrow}$ be a regular Schur set in $\mathbb{R}^n$. Let $F \in C^1(D)$. If $F$ is strong Schur's order preserving then

(4.7.4) $\frac{\partial F}{\partial x_1}(x) \ge \ldots \ge \frac{\partial F}{\partial x_n}(x) \ge 0$, for each $x = (x_1, \ldots, x_n)^T \in D$.

Suppose that in addition to the above assumptions $D$ is convex. If $F$ satisfies the above inequalities then $F$ is strong Schur's order preserving. If for any point $x = (x_1, \ldots, x_n)^T \in D$, $\frac{\partial F}{\partial x_n}(x) > 0$ and $\frac{\partial F}{\partial x_i}(x) > \frac{\partial F}{\partial x_{i+1}}(x)$ whenever $x_i > x_{i+1}$ holds, then $F$ is strict strong Schur's order preserving.

Proof. Assume that $F$ is strong Schur's order preserving. Since $F$ is Schur's order preserving (4.7.1) holds. Let $x = (x_1, \ldots, x_n)^T \in D^o$. Define $w(t) = x + te_n$. Then there exists $\epsilon > 0$ such that $w(t) \in D^o$ for $t \in [-\epsilon, \epsilon]$. Clearly $w(t_1) \preceq w(t_2)$ for $-\epsilon \le t_1 \le t_2 \le \epsilon$. Hence the function $h(t) := F(w(t))$ is nondecreasing on the interval $[-\epsilon, \epsilon]$. Thus $\frac{\partial F}{\partial x_n}(x) = h'(0) \ge 0$. Use the continuity argument to deduce that $\frac{\partial F}{\partial x_n}(x) \ge 0$ for any $x \in D$.

Assume that $D$ is convex and (4.7.4) holds. Let $y, z \in D$ and define $y(t)$ and $g(t)$ as in the proof of Theorem 4.7.5. Then

(4.7.5) $g'(t) = \sum_{i=1}^{n-1} \left(\frac{\partial F(y(t))}{\partial x_i} - \frac{\partial F(y(t))}{\partial x_{i+1}}\right) \sum_{j=1}^{i} (z_j - y_j) + \frac{\partial F(y(t))}{\partial x_n} \sum_{j=1}^{n} (z_j - y_j)$.

See Problem 3. Assume that $y \preceq z$. Then $g'(t) \ge 0$. Hence $F(y) \le F(z)$.

Assume now that for any point $x = (x_1, \ldots, x_n)^T \in D$, $\frac{\partial F}{\partial x_n}(x) > 0$ and $\frac{\partial F}{\partial x_i}(x) > \frac{\partial F}{\partial x_{i+1}}(x)$ whenever $x_i > x_{i+1}$. Let $y, z \in D$ and assume that $y \preceq z$ and $y \ne z$. Define $g(t)$ on $[0, 1]$ as above. Use (4.7.5) to deduce that $g'(t) > 0$ on $[0, 1]$. Hence $F(y) < F(z)$. □

Let $C$ be a convex set. A function $f : C \to \mathbb{R}$ is called convex if

(4.7.6) $f(ax + (1 - a)y) \le af(x) + (1 - a)f(y)$, for any $a \in [0, 1]$ and $x, y \in C$.

$f$ is called strictly convex if for any $x, y \in C$, $x \ne y$ and $a \in (0, 1)$ the strict inequality holds in the above inequality. The following result is well known. (See Problems 4-9.)

Theorem 4.7.7 Let $C \subset \mathbb{R}^d$ be a regular convex set. Assume that $f \in C^2(C)$. Then $f$ is convex if and only if the symmetric matrix $H(f) := \left(\frac{\partial^2 f}{\partial x_i \partial x_j}(y)\right)_{i,j=1}^d$ is nonnegative definite for each $y \in C$. Furthermore, if $H(f)$ is positive definite for each $y \in C$ then $f$ is strictly convex.

For any set $T \subset V$ we let $\mathrm{Cl}\, T$ be the closure of $T$ in the standard topology in $V$ (which is identified with the standard topology of $\mathbb{R}^{\dim_{\mathbb{R}} V}$).

Proposition 4.7.8 Let $C \subset V$ be convex. Then $\mathrm{Cl}\, C$ is convex. Assume that $C$ is a regular set and $f \in C^0(\mathrm{Cl}\, C)$. Then $f$ is convex in $\mathrm{Cl}\, C$ if and only if $f$ is convex in $C$.

See Problem 8.

Denote by $\bar{\mathbb{R}} := \mathbb{R} \cup \{-\infty, \infty\}$ the extended real line. Then $a + \infty = \infty + a = \infty$ for $a \in \mathbb{R} \cup \{\infty\}$, $a - \infty = -\infty + a = -\infty$ for $a \in \mathbb{R} \cup \{-\infty\}$, and $\infty - \infty$, $-\infty + \infty$ are not defined. For $a > 0$ we let $a\infty = \infty a = \infty$, $a(-\infty) = (-\infty)a = -\infty$ and $0\infty = \infty 0 = 0$, $0(-\infty) = (-\infty)0 = 0$. Clearly for any $a \in \mathbb{R}$, $-\infty < a < \infty$. Let $C$ be a convex set. Then $f : C \to \bar{\mathbb{R}}$ is called an extended convex function if (4.7.6) holds. Let $f : C \to \mathbb{R}$ be a convex function. Then $f$ has the following continuity and differentiability properties:

In the one dimensional case where $C = (a, b) \subset \mathbb{R}$, $f$ is continuous on $C$ and $f$ has a derivative $f'(x)$ at all but a countable set of points. $f'(x)$ is a nondecreasing function (where defined). In particular $f$ has left and right derivatives at each $x$, which are given as the left and the right limits of $f'(x)$ (where defined).

In the general case $C \subset V$, $f$ is a continuous function in $\mathrm{ri}\, C$, has a differential $Df$ in a dense set $C_1$ of $\mathrm{ri}\, C$, the complement of $C_1$ in $\mathrm{ri}\, C$ has zero measure, and $Df$ is continuous in $C_1$. Furthermore at each $x \in \mathrm{ri}\, C$, $f$ has a subdifferential $\phi \in \mathrm{Hom}(V, \mathbb{R})$ such that

(4.7.7) $f(y) \ge f(x) + \phi(y - x)$ for all $y \in C$.

See for example [Roc70].






Proposition 4.7.9 (The maximal principle) Let $C$ be a convex set and let $f_\phi : C \to \bar{\mathbb{R}}$ be an extended convex function for each $\phi$ in a set $\Phi$. Then

$f(x) := \sup_{\phi \in \Phi} f_\phi(x)$, for each $x \in C$,

is an extended convex function on $C$.

Theorem 4.7.10 Let $V$ be an $n$-dimensional IPS over $\mathbb{F} = \mathbb{R}, \mathbb{C}$. Then the function $\phi_i : \mathrm{S}(V) \to \mathbb{R}$ given by

(4.7.8) $\phi_i(T) := \sum_{j=1}^i \lambda_j(T)$, $T \in \mathrm{S}(V)$, $i = 1, \ldots, n$,

is a continuous homogeneous convex function for $i = 1, \ldots, n - 1$. $\phi_n(T) = \mathrm{tr}\, T$ is a linear function on $\mathrm{S}(V)$.

Proof. Clearly $\phi_i(aT) = a\phi_i(T)$ for $a \in [0, \infty)$. Hence $\phi_i$ is a homogeneous function. Since the eigenvalues of $T$ are continuous it follows that $\phi_i$ is a continuous function. Clearly $\phi_n$ is a linear function on the vector space $\mathrm{S}(V)$. Combine Theorem 4.4.8 with Proposition 4.7.9 to deduce that $\phi_i$ is convex. □

Corollary 4.7.11 Let $V$ be a finite dimensional IPS over $\mathbb{F} = \mathbb{R}, \mathbb{C}$. Then

$\lambda(\alpha A + (1 - \alpha)B) \prec \alpha\lambda(A) + (1 - \alpha)\lambda(B)$, for any $A, B \in \mathrm{S}(V)$, $\alpha \in [0, 1]$.

For $\alpha \in (0, 1)$ equality holds if and only if there exists an orthonormal basis $[v_1, \ldots, v_n]$ in $V$ such that

$Av_i = \lambda_i(A)v_i$, $Bv_i = \lambda_i(B)v_i$, $i = 1, \ldots, n$.

See Problem 10.
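The majorization in Corollary 4.7.11 can be observed numerically in the smallest nontrivial case. The sketch below is an illustration under stated assumptions: it is restricted to real symmetric $2 \times 2$ matrices, whose eigenvalues have a closed form, it works in floating point with a tolerance `tol`, and the two matrices are arbitrary choices without a common eigenbasis.

```python
import math

def eigenvalues_sym_2x2(m):
    """Eigenvalues of a real symmetric 2x2 matrix in decreasing order."""
    (p, q), (_, r) = m
    mean = (p + r) / 2.0
    radius = math.hypot((p - r) / 2.0, q)
    return [mean + radius, mean - radius]

def majorized(x, y, tol=1e-12):
    """True if x is majorized by y, up to the tolerance tol."""
    xs, ys = sorted(x, reverse=True), sorted(y, reverse=True)
    sx = sy = 0.0
    for u, v in zip(xs, ys):
        sx += u
        sy += v
        if sx > sy + tol:          # a partial sum of x exceeds that of y
            return False
    return abs(sx - sy) <= tol     # total sums must agree

A = [[1.0, 0.0], [0.0, -1.0]]
B = [[0.0, 1.0], [1.0, 0.0]]
alpha = 0.5
mix = [[alpha * A[i][j] + (1 - alpha) * B[i][j] for j in range(2)]
       for i in range(2)]
lhs = eigenvalues_sym_2x2(mix)     # lambda(alpha A + (1 - alpha) B)
rhs = [alpha * s + (1 - alpha) * t
       for s, t in zip(eigenvalues_sym_2x2(A), eigenvalues_sym_2x2(B))]
```

Here `lhs` is strictly more balanced than `rhs`, consistent with the equality case: $A$ and $B$ do not share an orthonormal basis of eigenvectors ordered as in the corollary.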

Proposition 4.7.12 Let $V$ be an $n$-dimensional IPS over $\mathbb{F} = \mathbb{R}, \mathbb{C}$. For $D \subset \mathbb{R}^n_{\searrow}$ let

(4.7.9) $\lambda^{-1}(D) := \{T \in \mathrm{S}(V) : \lambda(T) \in D\}$.

If $D \subset \mathbb{R}^n_{\searrow}$ is a regular convex Schur set then $\lambda^{-1}(D)$ is a regular convex set in the vector space $\mathrm{S}(V)$.

Proof. The continuity of the function $\lambda : \mathrm{S}(V) \to \mathbb{R}^n_{\searrow}$ implies that $\lambda^{-1}(D)$ is a regular set in $\mathrm{S}(V)$. Suppose that $A, B \in \lambda^{-1}(D)$ and $\alpha \in [0, 1]$. Since $D$ is convex $\alpha\lambda(A) + (1 - \alpha)\lambda(B) \in D$. Since $D$ is a Schur set Corollary 4.7.11 yields that $\lambda(\alpha A + (1 - \alpha)B) \in D$. Hence $\alpha A + (1 - \alpha)B \in \lambda^{-1}(D)$. □

Theorem 4.7.13 Let $D \subset \mathbb{R}^n_{\searrow}$ be a regular convex Schur set and let $h : D \to \mathbb{R}$. Let $V$ be an $n$-dimensional IPS over $\mathbb{F} = \mathbb{R}, \mathbb{C}$. Let $f : \lambda^{-1}(D) \to \mathbb{R}$ be the spectral function given by $f(A) := h(\lambda(A))$. Then the following are equivalent:

(a) $f$ is (strictly) convex on $\lambda^{-1}(D)$.

(b) $h$ is (strictly) convex and (strictly) Schur's order preserving on $D$.

Proof. Choose a fixed orthonormal basis $[u_1, \ldots, u_n]$. We then identify $\mathrm{S}(V)$ with $\mathrm{H}_n(\mathbb{F})$. Thus we view $\mathcal{T} := \lambda^{-1}(D)$ as a subset of $\mathrm{H}_n(\mathbb{F})$. Since $D$ is a regular convex Schur set Proposition 4.7.12 yields that $\mathcal{T}$ is a regular convex set. For $x = (x_1, \ldots, x_n)^T \in \mathbb{R}^n$ let $D(x) := \mathrm{diag}(x_1, \ldots, x_n)$. Then $\lambda(D(x)) = \bar{x}$. Thus $D(x) \in \mathcal{T} \iff \bar{x} \in D$ and $f(D(x)) = h(\bar{x})$ for $\bar{x} \in D$.

(a) $\Rightarrow$ (b). Assume that $f$ is convex on $\mathcal{T}$. By restricting $f$ to $D(x)$, $x \in D$, we deduce that $h$ is convex on $D$. If $f$ is strictly convex on $\mathcal{T}$ we deduce that $h$ is strictly convex on $D$.

Let $x, y \in D$ and assume that $x \prec y$. Then (4.6.3) holds. Hence

$D(x) = \sum_{P \in \mathcal{P}_n} a_P P D(y) P^T$.

Clearly $\lambda(PD(y)P^T) = \lambda(D(y)) = y$. The convexity of $f$ yields

$h(x) = f(D(x)) \le \sum_{P \in \mathcal{P}_n} a_P f(PD(y)P^T) = f(D(y)) = h(y)$.

See Problem 6. Hence $h$ is Schur's order preserving. If $f$ is strictly convex on $\mathcal{T}$ then in the above inequality one has a strict inequality if $x \ne y$. Hence $h$ is strictly Schur's order preserving.

(b) $\Rightarrow$ (a). Assume that $h$ is convex. Then for $A, B \in \mathcal{T}$

$\alpha f(A) + (1 - \alpha)f(B) = \alpha h(\lambda(A)) + (1 - \alpha)h(\lambda(B)) \ge h(\alpha\lambda(A) + (1 - \alpha)\lambda(B))$.

Use Corollary 4.7.11 and the assumption that $h$ is Schur's order preserving to deduce the convexity of $f$. Suppose that $h$ is strictly convex and strictly Schur's order preserving. Assume that $f(\alpha A + (1 - \alpha)B) = \alpha f(A) + (1 - \alpha)f(B)$ for some $A, B \in \mathcal{T}$ and $\alpha \in (0, 1)$. Hence $\lambda(A) = \lambda(B)$ and $\lambda(\alpha A + (1 - \alpha)B) = \alpha\lambda(A) + (1 - \alpha)\lambda(B)$. Use Corollary 4.7.11 to deduce that $A = B$. Hence $f$ is strictly convex. □

Theorem 4.7.14 Let $D \subset \mathbb{R}^n_{\searrow}$ be a regular convex Schur set and let $h \in C^1(D)$. Let $V$ be an $n$-dimensional IPS over $\mathbb{F} = \mathbb{R}, \mathbb{C}$. Let $f : \lambda^{-1}(D) \to \mathbb{R}$ be the spectral function given by $f(A) := h(\lambda(A))$. Then the following are equivalent:

(a) $f$ is convex on $\lambda^{-1}(D)$ and $f(A) \le f(B)$ for any $A, B \in \lambda^{-1}(D)$ such that $A \le B$.

(b) $h$ is convex and strongly Schur's order preserving on $D$.

Proof. We repeat the proof of Theorem 4.7.13 with the following modifications.

(b) $\Rightarrow$ (a). Since $h$ is convex and Schur's order preserving Theorem 4.7.13 yields that $f$ is convex on $\mathcal{T}$. Let $A, B \in \mathcal{T}$ and assume that $A \le B$. Then $\lambda(A) \preceq \lambda(B)$. As $h$ is strongly Schur's order preserving $h(\lambda(A)) \le h(\lambda(B)) \Rightarrow f(A) \le f(B)$.

(a) $\Rightarrow$ (b). Since $f$ is convex on $\mathcal{T}$ Theorem 4.7.13 implies that $h$ is convex and Schur's order preserving. Since $h \in C^1(D)$ Theorem 4.7.5 yields that $h$ satisfies the inequalities (4.7.1). Let $x \in D^o$ and define $x(t) := x + te_n$. Then for a small $a > 0$, $x(t) \in D^o$ for $t \in (-a, a)$. Clearly $D(x(t_1)) \le D(x(t_2))$ for $t_1 \le t_2$. Hence $g(t) := f(D(x(t))) = h(x(t))$ is a nondecreasing function on $(-a, a)$. Hence $\frac{\partial h}{\partial x_n}(x) = g'(0) \ge 0$. Use the continuity hypothesis to deduce that $h$ satisfies (4.7.4). Theorem 4.7.6 yields that $h$ is strong Schur's order preserving. □

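Theorem 4.7.15 Let $\mathbf{V}$ be an $N$-dimensional IPS over $\mathbb{F} = \mathbb{R}, \mathbb{C}$. For $n, N \in \mathbb{N}$ and $n < N$ let $\lambda_{(n)} : S(\mathbf{V}) \to \mathbb{R}^n_{\searrow}$ be the map $A \mapsto \lambda_{(n)}(A) := (\lambda_1(A), \ldots, \lambda_n(A))^T$. Assume that $D \subset \mathbb{R}^n_{\searrow}$ is a regular convex Schur set and let $T := \lambda_{(n)}^{-1}(D) \subset S(\mathbf{V})$. Let $f : T \to \mathbb{R}$ be the spectral function given by $f(A) := h(\lambda_{(n)}(A))$. Assume that $n < N$. Then the following are equivalent:

(a) $f$ is convex on $T$.

(b) $h$ is convex and strongly Schur's order preserving on $D$.

(c) $f$ is convex on $T$ and $f(A) \le f(B)$ for any $A, B \in T$ such that $A \le B$.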

Proof. Let $\pi : \mathbb{R}^N_{\searrow} \to \mathbb{R}^n_{\searrow}$ be the projection on the first $n$ coordinates. Let $D_1 := \pi^{-1}(D) \subset \mathbb{R}^N_{\searrow}$. It is straightforward to show that $D_1$ is a regular convex set. Let $h_1 := h \circ \pi : D_1 \to \mathbb{R}$. Then $\frac{\partial h_1}{\partial x_i} = 0$ for $i = n+1, \ldots, N$.

(a) $\Rightarrow$ (b). Suppose that $f$ is convex on $T$. Then Theorem 4.7.13 yields that $h_1$ is convex and Schur's order preserving. Theorem 4.7.5 yields the inequalities (4.7.1). Hence $\frac{\partial h_1}{\partial x_n}(\mathbf{y}) \ge \frac{\partial h_1}{\partial x_{n+1}}(\mathbf{y}) = 0$ for any $\mathbf{y} \in D_1$. Clearly $h$ is convex and

$$\frac{\partial h}{\partial x_i}(\mathbf{x}) = \frac{\partial h_1}{\partial x_i}(\mathbf{y}), \quad i = 1, \ldots, n, \quad \text{where } \mathbf{x} \in D,\ \mathbf{y} \in D_1 \text{ and } \pi(\mathbf{y}) = \mathbf{x}.$$

Thus $h$ satisfies (4.7.4). Theorem 4.7.6 yields that $h$ is strongly Schur's order preserving.

Other nontrivial implications follow as in the proof of Theorem 4.7.14. □

Problems 

1. Show Proposition 4.7.4 

2. Let $\mathbf{x} = (x_1, \ldots, x_n) \in \mathbb{R}^n$ and assume $x_1 > \ldots > x_n$. Let $\mathbf{x}(t)$ be defined as in the proof of Theorem 4.7.5. Prove (4.7.2).

3. Let $D \subset \mathbb{R}^n$ be a regular set and assume that $[\mathbf{y}, \mathbf{z}] \subset D$, $\mathbf{y} = (y_1, \ldots, y_n)^T$, $\mathbf{z} = (z_1, \ldots, z_n)^T$. Let $F \in C^1(D)$ and assume that $g(t)$ is defined as in the proof of Theorem 4.7.5. Show the equality (4.7.5). Suppose furthermore that $\sum_{i=1}^n y_i = \sum_{i=1}^n z_i$. Show the equality (4.7.3).

4. (a) Let $f \in C^1(a, b)$. Show that $f$ is convex on $(a, b)$ if and only if $f'(x)$ is nondecreasing on $(a, b)$. Show that if $f'(x)$ is increasing on $(a, b)$ then $f$ is strictly convex on $(a, b)$.

(b) Let $f \in C[a, b] \cap C^1(a, b)$. Show that $f$ is convex in $[a, b]$ if and only if $f$ is convex in $(a, b)$. Show that if $f'(x)$ is increasing on $(a, b)$ then $f$ is strictly convex on $[a, b]$.

(c) Let $f \in C^2(a, b)$. Show that $f$ is convex on $(a, b)$ if and only if $f''$ is a nonnegative function on $(a, b)$. Show that if $f''(x) > 0$ for each $x \in (a, b)$ then $f$ is strictly convex on $(a, b)$.

(d) Prove Theorem 4.7.7. 

5. (This problem offers an alternative proof of Theorem 4.6.9.) Let $a < b$ and $n \in \mathbb{N}$. Denote

$$[a, b]^n_{\searrow} := \{(x_1, \ldots, x_n) \in \mathbb{R}^n_{\searrow} : x_i \in [a, b],\ i = 1, \ldots, n\}.$$

(a) Show that $[a, b]^n_{\searrow}$ is a regular convex Schur domain.

(b) Let $f \in C^1[a, b]$ be a convex function. Let $F : [a, b]^n \to \mathbb{R}$ be defined by $F((x_1, \ldots, x_n)^T) := \sum_{i=1}^n f(x_i)$. Show that $F$ satisfies the condition (4.7.1) on $[a, b]^n_{\searrow}$. Hence Theorem 4.6.9 holds for any $\mathbf{x}, \mathbf{y} \in [a, b]^n$ such that $\mathbf{x} \prec \mathbf{y}$.

(c) Assume that any convex $f \in C[a, b]$ can be uniformly approximated by a sequence of convex $f_k \in C^1[a, b]$, $k = 1, \ldots$. Show that Theorem 4.6.9 holds for $\mathbf{x}, \mathbf{y} \in [a, b]^n$ such that $\mathbf{x} \prec \mathbf{y}$.

6. Let $D \subset \mathbf{V}$ be a convex set and assume that $f : D \to \mathbb{R}$ is a convex function. Show that for any $k \ge 3$

$$f\Big(\sum_{j=1}^k a_j \mathbf{u}_j\Big) \le \sum_{j=1}^k a_j f(\mathbf{u}_j) \quad \text{for any } \mathbf{u}_1, \ldots, \mathbf{u}_k \in D,\ a_1, \ldots, a_k \ge 0,\ \sum_{j=1}^k a_j = 1.$$

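7. Let $\mathbf{V}$ be a finite dimensional space and $C \subset \mathbf{V}$ a nonempty convex set. Let $\mathbf{x} \in C$. Show

a. The subspace $\mathbf{U} := \mathrm{span}\,(C - \mathbf{x})$ does not depend on $\mathbf{x} \in C$.

b. $C - \mathbf{x}$ has a nonempty convex interior in $\mathbf{U}$ and the definition of $\mathrm{ri}\, C$ does not depend on $\mathbf{x} \in C$.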

8. Prove Proposition 4.7.8. 

9. Prove Proposition 4.7.9. 

10. Use Theorem 4.4.8 to show the equality case in Corollary 4.7.11 

11. For $p \in [1, \infty)$ let

$$\|\mathbf{x}\|_{p,\mathbf{w}} := \Big(\sum_{i=1}^n w_i |x_i|^p\Big)^{\frac{1}{p}}, \quad \mathbf{x} = (x_1, \ldots, x_n)^T \in \mathbb{R}^n,\ \mathbf{w} = (w_1, \ldots, w_n)^T \in \mathbb{R}^n_+.$$

(a) Show that $\|\cdot\|_{p,\mathbf{w}} : \mathbb{R}^n \to \mathbb{R}$ is a homogeneous convex function. Furthermore this function is strictly convex if and only if $p > 1$ and $w_i > 0$ for $i = 1, \ldots, n$. (Hint: First prove the case $\mathbf{w} = (1, \ldots, 1)$.)

(b) For $q > 1$ show that $\|\cdot\|^q_{p,\mathbf{w}} : \mathbb{R}^n \to \mathbb{R}$ is a convex function. Furthermore this function is strictly convex if and only if $w_i > 0$ for $i = 1, \ldots, n$. (Hint: Use the fact that $f(x) = x^q$ is strictly convex on $[0, \infty)$.)

(c) Show that for $q > 0$ the function $\|\cdot\|^q_{p,\mathbf{w}} : \mathbb{R}^n_{+,\searrow} \to \mathbb{R}$ is strong Schur's order preserving if and only if $w_1 \ge \ldots \ge w_n \ge 0$. Furthermore this function is strictly strong Schur's order preserving if and only if $w_1 \ge \ldots \ge w_n > 0$.

(d) Let $\mathbf{V}$ be an $n$-dimensional IPS over $\mathbb{F} = \mathbb{R}, \mathbb{C}$. Show that for $q \ge 1$, $w_1 \ge \ldots \ge w_n \ge 0$ the spectral function $T \to \|\lambda(T)\|^q_{p,\mathbf{w}}$ is a convex function on $S(\mathbf{V})_+$ (the positive self-adjoint operators on $\mathbf{V}$). If in addition $w_n > 0$ and $\max(p, q) > 1$ then the above function is strictly convex on $S(\mathbf{V})_+$.

12. Use the differentiability properties of convex functions to show that Theorems 4.7.14 and 4.7.15 hold under the weaker assumption $h \in C(D)$.






13. Show that on $\mathrm{H}^o_+$ the function $\log\det A$ is a strictly concave function, i.e. $\det(\alpha A + (1-\alpha)B) \ge (\det A)^{\alpha} (\det B)^{1-\alpha}$. (Hint: Observe that $-\log x$ is a strictly convex function on $(0, \infty)$.)

4.8 Inequalities for traces 

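Let $\mathbf{V}$ be a finite dimensional IPS over $\mathbb{F} = \mathbb{R}, \mathbb{C}$. Let $T : \mathbf{V} \to \mathbf{V}$ be a linear operator. Then $\mathrm{tr}\, T$ is the trace of the representation matrix $A$ of $T$ with respect to any orthonormal basis of $\mathbf{V}$. See Problem 1.

Theorem 4.8.1 Let $\mathbf{V}$ be an $n$-dimensional IPS over $\mathbb{F} = \mathbb{R}, \mathbb{C}$. Assume that $S, T \in S(\mathbf{V})$. Then $\mathrm{tr}\, ST$ is bounded below and above by

(4.8.1) $\sum_{i=1}^n \lambda_i(S)\lambda_{n-i+1}(T) \le \mathrm{tr}\, ST \le \sum_{i=1}^n \lambda_i(S)\lambda_i(T).$

Equality for the upper bound holds if and only if $ST = TS$ and there exists an orthonormal basis $\mathbf{x}_1, \ldots, \mathbf{x}_n \in \mathbf{V}$ such that

(4.8.2) $S\mathbf{x}_i = \lambda_i(S)\mathbf{x}_i, \quad T\mathbf{x}_i = \lambda_i(T)\mathbf{x}_i, \quad i = 1, \ldots, n.$

Equality for the lower bound holds if and only if $ST = TS$ and there exists an orthonormal basis $\mathbf{x}_1, \ldots, \mathbf{x}_n \in \mathbf{V}$ such that

(4.8.3) $S\mathbf{x}_i = \lambda_i(S)\mathbf{x}_i, \quad T\mathbf{x}_i = \lambda_{n-i+1}(T)\mathbf{x}_i, \quad i = 1, \ldots, n.$
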
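Proof. Let $\mathbf{y}_1, \ldots, \mathbf{y}_n$ be an orthonormal basis of $\mathbf{V}$ such that

$$T\mathbf{y}_i = \lambda_i(T)\mathbf{y}_i, \quad i = 1, \ldots, n,$$

$$\lambda_1(T) = \ldots = \lambda_{i_1}(T) > \lambda_{i_1+1}(T) = \ldots = \lambda_{i_2}(T) > \ldots > \lambda_{i_{k-1}+1}(T) = \ldots = \lambda_{i_k}(T) = \lambda_n(T), \quad 1 \le i_1 < \ldots < i_k = n.$$

If $k = 1 \iff i_1 = n$ it follows that $T = \lambda_1 I$ and the theorem is trivial in this case. Assume that $k > 1$. Then

$$\mathrm{tr}\, ST = \sum_{i=1}^n \lambda_i(T)\langle S\mathbf{y}_i, \mathbf{y}_i\rangle = \sum_{i=1}^{n-1} (\lambda_i(T) - \lambda_{i+1}(T))\Big(\sum_{l=1}^i \langle S\mathbf{y}_l, \mathbf{y}_l\rangle\Big) + \lambda_n(T)\Big(\sum_{l=1}^n \langle S\mathbf{y}_l, \mathbf{y}_l\rangle\Big)$$

$$= \sum_{j=1}^{k-1} (\lambda_{i_j}(T) - \lambda_{i_j+1}(T)) \sum_{l=1}^{i_j} \langle S\mathbf{y}_l, \mathbf{y}_l\rangle + \lambda_n(T)\,\mathrm{tr}\, S.$$
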
Theorem 4.4.8 yields that $\sum_{l=1}^{i_j} \langle S\mathbf{y}_l, \mathbf{y}_l\rangle \le \sum_{l=1}^{i_j} \lambda_l(S)$. Substitute these inequalities for $j = 1, \ldots, k-1$ in the above identity to deduce the upper bound in (4.8.1). Clearly the condition (4.8.2) implies that $\mathrm{tr}\, ST$ is equal to the upper bound in (4.8.1). Assume now that $\mathrm{tr}\, ST$ is equal to the upper bound in (4.8.1). Then $\sum_{l=1}^{i_j} \langle S\mathbf{y}_l, \mathbf{y}_l\rangle = \sum_{l=1}^{i_j} \lambda_l(S)$ for $j = 1, \ldots, k-1$.

Theorem 4.4.8 yields that $\mathrm{span}\,(\mathbf{y}_1, \ldots, \mathbf{y}_{i_j})$ is spanned by some $i_j$ eigenvectors of $S$ corresponding to the first $i_j$ eigenvalues of $S$ for $j = 1, \ldots, k-1$. Let $\mathbf{x}_1, \ldots, \mathbf{x}_{i_1}$ be an orthonormal basis of $\mathrm{span}\,(\mathbf{y}_1, \ldots, \mathbf{y}_{i_1})$ consisting of the eigenvectors of $S$ corresponding to the eigenvalues $\lambda_1(S), \ldots, \lambda_{i_1}(S)$. Since any $0 \ne \mathbf{x} \in \mathrm{span}\,(\mathbf{y}_1, \ldots, \mathbf{y}_{i_1})$ is an eigenvector of $T$ corresponding to the eigenvalue $\lambda_{i_1}(T)$ it follows that (4.8.2) holds for $i = 1, \ldots, i_1$. Consider $\mathrm{span}\,(\mathbf{y}_1, \ldots, \mathbf{y}_{i_2})$. The above arguments imply that this subspace contains $i_2$ eigenvectors of $S$ and $T$ corresponding to the first $i_2$ eigenvalues of $S$ and $T$. Hence $\mathbf{U}_2$, the orthogonal complement of $\mathrm{span}\,(\mathbf{x}_1, \ldots, \mathbf{x}_{i_1})$ in $\mathrm{span}\,(\mathbf{y}_1, \ldots, \mathbf{y}_{i_2})$, is spanned by $\mathbf{x}_{i_1+1}, \ldots, \mathbf{x}_{i_2}$, which are $i_2 - i_1$ orthonormal eigenvectors of $S$ corresponding to the eigenvalues $\lambda_{i_1+1}(S), \ldots, \lambda_{i_2}(S)$. Since any nonzero vector in $\mathbf{U}_2$ is an eigenvector of $T$ corresponding to the eigenvalue $\lambda_{i_2}(T)$ we deduce that (4.8.2) holds for $i = 1, \ldots, i_2$. Continuing in the same manner we obtain (4.8.2).

To prove the lower bound and its equality case, apply the upper bound and its equality case to $\mathrm{tr}\, S(-T)$, noting that $\lambda_i(-T) = -\lambda_{n-i+1}(T)$. □
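
The two-sided bound (4.8.1) can be checked numerically in the smallest nontrivial case. The following is a minimal sketch, assuming real symmetric 2x2 matrices so that eigenvalues have a closed form; the matrices `S`, `T` and the helper names are illustrative choices, not taken from the text.

```python
import math

def sym_eigvals(m):
    """Eigenvalues, in decreasing order, of a real symmetric 2x2 matrix."""
    (a, b), (_, d) = m
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return [mean + radius, mean - radius]

def trace_of_product(s, t):
    """tr(ST) for 2x2 matrices given as nested lists."""
    return sum(s[i][j] * t[j][i] for i in range(2) for j in range(2))

def trace_bounds(s, t):
    """Lower and upper bounds of (4.8.1) for real symmetric 2x2 S and T."""
    ls, lt = sym_eigvals(s), sym_eigvals(t)
    upper = sum(x * y for x, y in zip(ls, lt))            # same ordering
    lower = sum(x * y for x, y in zip(ls, reversed(lt)))  # opposite ordering
    return lower, upper

# Hypothetical example matrices (they do not commute).
S = [[2.0, 1.0], [1.0, 3.0]]
T = [[1.0, -1.0], [-1.0, 4.0]]
lower, upper = trace_bounds(S, T)
tr_st = trace_of_product(S, T)
print(lower, tr_st, upper)
```

For commuting matrices with eigenvalues ordered the same way, for instance two diagonal matrices with decreasing diagonals, the trace should coincide with the upper bound, in line with the equality condition (4.8.2).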

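Corollary 4.8.2 Let $\mathbf{V}$ be an $n$-dimensional IPS over $\mathbb{F} = \mathbb{R}, \mathbb{C}$. Assume that $S, T \in S(\mathbf{V})$. Then

(4.8.4) $\sum_{i=1}^n (\lambda_i(S) - \lambda_i(T))^2 \le \mathrm{tr}\,(S - T)^2.$

Equality holds if and only if $ST = TS$ and $\mathbf{V}$ has an orthonormal basis $\mathbf{x}_1, \ldots, \mathbf{x}_n$ satisfying (4.8.2).

Proof. Note

$$\sum_{i=1}^n (\lambda_i(S) - \lambda_i(T))^2 = \mathrm{tr}\, S^2 + \mathrm{tr}\, T^2 - 2\sum_{i=1}^n \lambda_i(S)\lambda_i(T),$$

while $\mathrm{tr}\,(S - T)^2 = \mathrm{tr}\, S^2 + \mathrm{tr}\, T^2 - 2\,\mathrm{tr}\, ST$. Now use the upper bound in (4.8.1) and its equality case.

□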
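
As a numerical illustration of (4.8.4), the sketch below compares both sides for a pair of real symmetric 2x2 matrices. The matrices and function names are assumptions made for the example only.

```python
import math

def sym_eigvals(m):
    """Eigenvalues, in decreasing order, of a real symmetric 2x2 matrix."""
    (a, b), (_, d) = m
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return [mean + radius, mean - radius]

def eigenvalue_gap_and_trace(s, t):
    """Return (lhs, rhs) of (4.8.4) for real symmetric 2x2 matrices S and T."""
    lhs = sum((x - y) ** 2 for x, y in zip(sym_eigvals(s), sym_eigvals(t)))
    diff = [[s[i][j] - t[i][j] for j in range(2)] for i in range(2)]
    rhs = sum(diff[i][j] * diff[j][i] for i in range(2) for j in range(2))  # tr (S - T)^2
    return lhs, rhs

# Hypothetical example matrices.
S = [[2.0, 1.0], [1.0, 3.0]]
T = [[1.0, -1.0], [-1.0, 4.0]]
lhs, rhs = eigenvalue_gap_and_trace(S, T)
print(lhs, rhs)
```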



Corollary 4.8.3 Let $S, T \in \mathrm{H}_n$. Then the inequalities (4.8.1) and (4.8.4) hold. Equalities in the upper bounds hold if and only if there exists $U \in \mathrm{U}_n$ such that $S = U \mathrm{diag}\,\lambda(S)\, U^*$, $T = U \mathrm{diag}\,\lambda(T)\, U^*$. Equality in the lower bound of (4.8.1) holds if and only if there exists $V \in \mathrm{U}_n$ such that $S = V \mathrm{diag}\,\lambda(S)\, V^*$, $-T = V \mathrm{diag}\,\lambda(-T)\, V^*$.
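Problems

1. Let $\mathbf{V}$ be an $n$-dimensional IPS over $\mathbb{F} = \mathbb{R}, \mathbb{C}$.

(a) Assume that $T : \mathbf{V} \to \mathbf{V}$ is a linear transformation. Show that for any orthonormal basis $\mathbf{x}_1, \ldots, \mathbf{x}_n$

$$\mathrm{tr}\, T = \sum_{i=1}^n \langle T\mathbf{x}_i, \mathbf{x}_i\rangle.$$

Furthermore, if $\mathbb{F} = \mathbb{C}$ then $\mathrm{tr}\, T$ is the sum of the $n$ eigenvalues of $T$.

(b) Let $S, T \in S(\mathbf{V})$. Show that $\mathrm{tr}\, ST = \mathrm{tr}\, TS \in \mathbb{R}$.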

4.9 Singular Value Decomposition 

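Let $\mathbf{U}, \mathbf{V}$ be finite dimensional IPS over $\mathbb{F} = \mathbb{R}, \mathbb{C}$, with the inner products $\langle\cdot,\cdot\rangle_{\mathbf{U}}, \langle\cdot,\cdot\rangle_{\mathbf{V}}$ respectively. Let $\mathbf{u}_1, \ldots, \mathbf{u}_m$ and $\mathbf{v}_1, \ldots, \mathbf{v}_n$ be bases in $\mathbf{U}$ and $\mathbf{V}$ respectively. Let $T : \mathbf{V} \to \mathbf{U}$ be a linear operator. In these bases $T$ is represented by a matrix $A \in \mathbb{F}^{m \times n}$ as given by (1.10.2). Let $T^* : \mathbf{U}^* = \mathbf{U} \to \mathbf{V}^* = \mathbf{V}$. Then $T^*T : \mathbf{V} \to \mathbf{V}$ and $TT^* : \mathbf{U} \to \mathbf{U}$ are selfadjoint operators. As

$$\langle T^*T\mathbf{v}, \mathbf{v}\rangle_{\mathbf{V}} = \langle T\mathbf{v}, T\mathbf{v}\rangle_{\mathbf{U}} \ge 0, \quad \langle TT^*\mathbf{u}, \mathbf{u}\rangle_{\mathbf{U}} = \langle T^*\mathbf{u}, T^*\mathbf{u}\rangle_{\mathbf{V}} \ge 0$$

it follows that $T^*T \ge 0$, $TT^* \ge 0$. Let

(4.9.1) $T^*T\mathbf{c}_i = \lambda_i(T^*T)\mathbf{c}_i, \quad \langle \mathbf{c}_i, \mathbf{c}_k\rangle_{\mathbf{V}} = \delta_{ik}, \quad i, k = 1, \ldots, n, \quad \lambda_1(T^*T) \ge \ldots \ge \lambda_n(T^*T) \ge 0,$

(4.9.2) $TT^*\mathbf{d}_j = \lambda_j(TT^*)\mathbf{d}_j, \quad \langle \mathbf{d}_j, \mathbf{d}_l\rangle_{\mathbf{U}} = \delta_{jl}, \quad j, l = 1, \ldots, m, \quad \lambda_1(TT^*) \ge \ldots \ge \lambda_m(TT^*) \ge 0.$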

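Proposition 4.9.1 Let $\mathbf{U}, \mathbf{V}$ be finite dimensional IPS over $\mathbb{F} = \mathbb{R}, \mathbb{C}$. Let $T : \mathbf{V} \to \mathbf{U}$. Then $\mathrm{rank}\, T = \mathrm{rank}\, T^* = \mathrm{rank}\, T^*T = \mathrm{rank}\, TT^* = r$. Furthermore the selfadjoint nonnegative definite operators $T^*T$ and $TT^*$ have exactly $r$ positive eigenvalues, and

(4.9.3) $\lambda_i(T^*T) = \lambda_i(TT^*) > 0, \quad i = 1, \ldots, \mathrm{rank}\, T.$

Moreover for $i \in [1, r]$, $T\mathbf{c}_i$ and $T^*\mathbf{d}_i$ are eigenvectors of $TT^*$ and $T^*T$ corresponding to the eigenvalue $\lambda_i(TT^*) = \lambda_i(T^*T)$ respectively. Furthermore if $\mathbf{c}_1, \ldots, \mathbf{c}_r$ satisfy (4.9.1) then $\mathbf{d}_i := \frac{T\mathbf{c}_i}{\|T\mathbf{c}_i\|}$, $i = 1, \ldots, r$ satisfy (4.9.2) for $i = 1, \ldots, r$. A similar result holds for $\mathbf{d}_1, \ldots, \mathbf{d}_r$.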
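Proof. Clearly $T\mathbf{x} = \mathbf{0} \iff \langle T\mathbf{x}, T\mathbf{x}\rangle = 0 \iff T^*T\mathbf{x} = \mathbf{0}$. Hence

$$\mathrm{rank}\, T^*T = \mathrm{rank}\, T = \mathrm{rank}\, T^* = \mathrm{rank}\, TT^* = r.$$

Thus $T^*T$ and $TT^*$ have exactly $r$ positive eigenvalues. Let $i \in [1, r]$. Then $T^*T\mathbf{c}_i \ne \mathbf{0}$. Hence $T\mathbf{c}_i \ne \mathbf{0}$. (4.9.1) yields that $TT^*(T\mathbf{c}_i) = \lambda_i(T^*T)(T\mathbf{c}_i)$. Similarly $T^*T(T^*\mathbf{d}_i) = \lambda_i(TT^*)(T^*\mathbf{d}_i) \ne \mathbf{0}$. Hence (4.9.3) holds. Assume that $\mathbf{c}_1, \ldots, \mathbf{c}_r$ satisfy (4.9.1). Let $\mathbf{d}_1, \ldots, \mathbf{d}_r$ be defined as above. By the definition $\|\mathbf{d}_i\| = 1$, $i = 1, \ldots, r$. Let $1 \le i < j \le r$. Then

$$0 = \langle \mathbf{c}_i, \mathbf{c}_j\rangle \Rightarrow 0 = \lambda_i(T^*T)\langle \mathbf{c}_i, \mathbf{c}_j\rangle = \langle T^*T\mathbf{c}_i, \mathbf{c}_j\rangle = \langle T\mathbf{c}_i, T\mathbf{c}_j\rangle \Rightarrow \langle \mathbf{d}_i, \mathbf{d}_j\rangle = 0.$$

Hence $\mathbf{d}_1, \ldots, \mathbf{d}_r$ is an orthonormal system. □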
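
The construction in Proposition 4.9.1 can be traced on a small example. The sketch below is an illustration under stated assumptions: a real 3x2 matrix `A` (hypothetical), so that $A^TA$ is 2x2 and its eigenpairs have a closed form. It builds $\mathbf{d}_i = A\mathbf{c}_i/\|A\mathbf{c}_i\|$ and reports how well each $\mathbf{d}_i$ satisfies $AA^T\mathbf{d}_i = \lambda_i\mathbf{d}_i$.

```python
import math

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]

def sym2_eigpairs(g):
    """Eigenpairs (value, unit vector) of a real symmetric 2x2 matrix with g[0][1] != 0."""
    (a, b), (_, d) = g
    mean, radius = (a + d) / 2.0, math.hypot((a - d) / 2.0, b)
    pairs = []
    for lam in (mean + radius, mean - radius):
        v = [b, lam - a]                 # solves (G - lam I) v = 0 when b != 0
        n = math.hypot(*v)
        pairs.append((lam, [x / n for x in v]))
    return pairs

def left_vectors(a):
    """Triples (lam, c, d) with A^T A c = lam c and d = A c / ||A c||."""
    triples = []
    for lam, c in sym2_eigpairs(matmul(transpose(a), a)):
        ac = matvec(a, c)
        norm = math.sqrt(sum(x * x for x in ac))
        triples.append((lam, c, [x / norm for x in ac]))
    return triples

A = [[1.0, 2.0], [0.0, 1.0], [1.0, 0.0]]    # hypothetical 3x2 example of rank 2
AAT = matmul(A, transpose(A))
for lam, c, d in left_vectors(A):
    residual = max(abs(x - lam * y) for x, y in zip(matvec(AAT, d), d))
    print(lam, d, residual)
```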
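Let

(4.9.4) $\sigma_i(T) = \sqrt{\lambda_i(T^*T)}$ for $i = 1, \ldots, r$, $\quad \sigma_i(T) = 0$ for $i > r$, $\quad \sigma_{(p)}(T) := (\sigma_1(T), \ldots, \sigma_p(T))^T \in \mathbb{R}^p_{\searrow}, \quad p \in \mathbb{N}.$

Then $\sigma_i(T) = \sigma_i(T^*)$, $i = 1, \ldots, \min(m, n)$ are called the singular values of $T$ and $T^*$ respectively. Note that the singular values are arranged in a decreasing order. The positive singular values are called principal singular values of $T$ and $T^*$ respectively. Note that

$$\|T\mathbf{c}_i\|^2 = \langle T\mathbf{c}_i, T\mathbf{c}_i\rangle = \langle T^*T\mathbf{c}_i, \mathbf{c}_i\rangle = \lambda_i(T^*T) = \sigma_i^2 \Rightarrow \|T\mathbf{c}_i\| = \sigma_i, \quad i = 1, \ldots, n,$$

$$\|T^*\mathbf{d}_j\|^2 = \langle T^*\mathbf{d}_j, T^*\mathbf{d}_j\rangle = \langle TT^*\mathbf{d}_j, \mathbf{d}_j\rangle = \lambda_j(TT^*) = \sigma_j^2 \Rightarrow \|T^*\mathbf{d}_j\| = \sigma_j, \quad j = 1, \ldots, m.$$

Let $\mathbf{c}_1, \ldots, \mathbf{c}_n$ be an orthonormal basis of $\mathbf{V}$ satisfying (4.9.1). Choose an orthonormal basis $\mathbf{d}_1, \ldots, \mathbf{d}_m$ as follows. Set $\mathbf{d}_i := \frac{T\mathbf{c}_i}{\sigma_i}$, $i = 1, \ldots, r$. Then complete the orthonormal set $\{\mathbf{d}_1, \ldots, \mathbf{d}_r\}$ to an orthonormal basis of $\mathbf{U}$. Since $\mathrm{span}\,(\mathbf{d}_1, \ldots, \mathbf{d}_r)$ is spanned by all eigenvectors of $TT^*$ corresponding to nonzero eigenvalues of $TT^*$ it follows that $\ker T^* = \mathrm{span}\,(\mathbf{d}_{r+1}, \ldots, \mathbf{d}_m)$. Hence (4.9.2) holds. In these orthonormal bases of $\mathbf{U}$ and $\mathbf{V}$ the operators $T$ and $T^*$ are represented quite simply:

(4.9.5) $T\mathbf{c}_i = \sigma_i(T)\mathbf{d}_i$, $i = 1, \ldots, n$, where $\mathbf{d}_i = \mathbf{0}$ for $i > m$; $\quad T^*\mathbf{d}_j = \sigma_j(T)\mathbf{c}_j$, $j = 1, \ldots, m$, where $\mathbf{c}_j = \mathbf{0}$ for $j > n$.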
Let

(4.9.6) $\Sigma = (s_{ij})_{i,j=1}^{m,n}, \quad s_{ij} = 0$ for $i \ne j$, $\quad s_{ii} = \sigma_i$ for $i = 1, \ldots, \min(m, n).$

In the case $m \ne n$ we call $\Sigma$ a diagonal matrix with the diagonal $\sigma_1, \ldots, \sigma_{\min(m,n)}$. Then in the bases $[\mathbf{d}_1, \ldots, \mathbf{d}_m]$ and $[\mathbf{c}_1, \ldots, \mathbf{c}_n]$, $T$ and $T^*$ are represented by the matrices $\Sigma$ and $\Sigma^T$ respectively.

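Lemma 4.9.2 Let $[\mathbf{u}_1, \ldots, \mathbf{u}_m]$, $[\mathbf{v}_1, \ldots, \mathbf{v}_n]$ be orthonormal bases in the vector spaces $\mathbf{U}, \mathbf{V}$ over $\mathbb{F} = \mathbb{R}, \mathbb{C}$ respectively. Then $T$ and $T^*$ are represented by the matrices $A \in \mathbb{F}^{m \times n}$ and $A^* \in \mathbb{F}^{n \times m}$ respectively. Let $U \in \mathrm{U}(m)$ and $V \in \mathrm{U}(n)$ be the unitary matrices representing the change of base $[\mathbf{d}_1, \ldots, \mathbf{d}_m]$ to $[\mathbf{u}_1, \ldots, \mathbf{u}_m]$ and $[\mathbf{c}_1, \ldots, \mathbf{c}_n]$ to $[\mathbf{v}_1, \ldots, \mathbf{v}_n]$ respectively. (If $\mathbb{F} = \mathbb{R}$ then $U$ and $V$ are orthogonal matrices.) Then

(4.9.7) $A = U\Sigma V^* \in \mathbb{F}^{m \times n}, \quad U \in \mathrm{U}(m), \quad V \in \mathrm{U}(n).$

Proof. By the definition $T\mathbf{v}_j = \sum_{i=1}^m a_{ij}\mathbf{u}_i$. Let $U = (u_{ip})_{i,p=1}^m$, $V = (v_{jq})_{j,q=1}^n$. Then

$$T\mathbf{c}_q = \sum_{j=1}^n v_{jq} T\mathbf{v}_j = \sum_{j=1}^n v_{jq} \sum_{i=1}^m a_{ij}\mathbf{u}_i = \sum_{j=1}^n v_{jq} \sum_{i=1}^m a_{ij} \sum_{p=1}^m \bar{u}_{ip}\mathbf{d}_p.$$

Use the first equality of (4.9.5) to deduce that $U^*AV = \Sigma$. □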

Definition 4.9.3 (4.9.7) is called the singular value decomposition (SVD) of $A$.

Proposition 4.9.4 Let $\mathbb{F} = \mathbb{R}, \mathbb{C}$ and denote by $\mathcal{R}_{m,n,k}(\mathbb{F}) \subset \mathbb{F}^{m \times n}$ the set of all matrices of rank at most $k \in [1, \min(m, n)]$. Then $A \in \mathcal{R}_{m,n,k}(\mathbb{F})$ if and only if $A$ can be expressed as a sum of at most $k$ matrices of rank 1. Furthermore $\mathcal{R}_{m,n,k}(\mathbb{F})$ is a variety in $\mathbb{F}^{m \times n}$ given by the polynomial conditions: each $(k+1) \times (k+1)$ minor of $A$ is equal to zero.

For the proof see Problem 2.

Definition 4.9.5 Let $A \in \mathbb{C}^{m \times n}$ and assume that $A$ has the SVD given by (4.9.7), where $U = [\mathbf{u}_1, \ldots, \mathbf{u}_m]$, $V = [\mathbf{v}_1, \ldots, \mathbf{v}_n]$. Denote by $A_k := \sum_{i=1}^k \sigma_i \mathbf{u}_i \mathbf{v}_i^* \in \mathbb{C}^{m \times n}$ for $k = 1, \ldots, \mathrm{rank}\, A$. For $k > \mathrm{rank}\, A$ we define $A_k := A$ $(= A_{\mathrm{rank}\, A})$.

Note that for $1 \le k < \mathrm{rank}\, A$, the matrix $A_k$ is uniquely defined if and only if $\sigma_k > \sigma_{k+1}$. (See Problem 6.)
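Theorem 4.9.6 For $\mathbb{F} = \mathbb{R}, \mathbb{C}$ and $A = (a_{ij}) \in \mathbb{F}^{m \times n}$ the following conditions hold:

(4.9.8) $\|A\|_F := \sqrt{\mathrm{tr}\, A^*A} = \sqrt{\mathrm{tr}\, AA^*} = \sqrt{\sum_{i=1}^{\mathrm{rank}\, A} \sigma_i(A)^2}.$

(4.9.9) $\|A\|_2 := \max_{\mathbf{x} \in \mathbb{F}^n, \|\mathbf{x}\|_2 = 1} \|A\mathbf{x}\|_2 = \sigma_1(A).$

(4.9.10) $\min_{B \in \mathcal{R}_{m,n,k}(\mathbb{F})} \|A - B\|_2 = \|A - A_k\|_2 = \sigma_{k+1}(A), \quad k = 1, \ldots, \mathrm{rank}\, A - 1.$

(4.9.11) $\sigma_i(A) \ge \sigma_i\big((a_{i_p j_q})_{p=1,q=1}^{m',n'}\big) \ge \sigma_{i+(m-m')+(n-n')}(A),$
$\quad m' \in [1, m],\ n' \in [1, n],\ 1 \le i_1 < \ldots < i_{m'} \le m,\ 1 \le j_1 < \ldots < j_{n'} \le n.$
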

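Proof. The proof of (4.9.8) is left as Problem 7. We now show the equality in (4.9.9). View $A$ as an operator $A : \mathbb{C}^n \to \mathbb{C}^m$. From the definition of $\|A\|_2$ it follows

$$\|A\|_2^2 = \max_{\mathbf{0} \ne \mathbf{x} \in \mathbb{C}^n} \frac{\mathbf{x}^*A^*A\mathbf{x}}{\mathbf{x}^*\mathbf{x}} = \lambda_1(A^*A) = \sigma_1(A)^2,$$

which proves (4.9.9).

We now prove (4.9.10). In the SVD decomposition of $A$ (4.9.7) assume that $U = (\mathbf{u}_1, \ldots, \mathbf{u}_m)$ and $V = (\mathbf{v}_1, \ldots, \mathbf{v}_n)$. Then (4.9.7) is equivalent to the following representation of $A$:

(4.9.12) $A = \sum_{i=1}^r \sigma_i \mathbf{u}_i \mathbf{v}_i^*, \quad \mathbf{u}_1, \ldots, \mathbf{u}_r \in \mathbb{R}^m,\ \mathbf{v}_1, \ldots, \mathbf{v}_r \in \mathbb{R}^n, \quad \mathbf{u}_i^*\mathbf{u}_j = \mathbf{v}_i^*\mathbf{v}_j = \delta_{ij}, \quad i, j = 1, \ldots, r,$

where $r = \mathrm{rank}\, A$. Let $B = \sum_{i=1}^k \sigma_i \mathbf{u}_i \mathbf{v}_i^* \in \mathcal{R}_{m,n,k}$. Then in view of (4.9.9)

$$\|A - B\|_2 = \Big\|\sum_{i=k+1}^r \sigma_i \mathbf{u}_i \mathbf{v}_i^*\Big\|_2 = \sigma_{k+1}.$$

Let $B \in \mathcal{R}_{m,n,k}$. To show (4.9.10) it is enough to show that $\|A - B\|_2 \ge \sigma_{k+1}$. Let

$$\mathbf{W} := \{\mathbf{x} \in \mathbb{R}^n : B\mathbf{x} = \mathbf{0}\}.$$

Then $\mathrm{codim}\, \mathbf{W} \le k$. Furthermore

$$\|A - B\|_2^2 \ge \max_{\|\mathbf{x}\|_2 = 1, \mathbf{x} \in \mathbf{W}} \|(A - B)\mathbf{x}\|_2^2 = \max_{\|\mathbf{x}\|_2 = 1, \mathbf{x} \in \mathbf{W}} \mathbf{x}^*A^*A\mathbf{x} \ge \lambda_{k+1}(A^*A) = \sigma_{k+1}^2,$$

where the last inequality follows from the min-max characterization of $\lambda_{k+1}(A^*A)$.

Let $C = (a_{i j_q})_{i,q=1}^{m,n'}$. Then $C^*C$ is a principal submatrix of $A^*A$ of dimension $n'$. The interlacing inequalities between the eigenvalues of $A^*A$ and $C^*C$ yield (4.9.11) for $m' = m$. Let $D = (a_{i_p j_q})_{p,q=1}^{m',n'}$. Then $DD^*$ is a principal submatrix of $CC^*$. Use the interlacing properties of the eigenvalues of $CC^*$ and $DD^*$ to deduce (4.9.11). □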
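
The identities (4.9.8), (4.9.9) and (4.9.10) can be checked by hand on a 2x2 example. The sketch below is a minimal illustration, assuming a real 2x2 matrix whose Gram matrix $A^TA$ has a nonzero off-diagonal entry; the matrix `A` and all function names are hypothetical. It forms the rank-one truncation $A_1 = (A\mathbf{c}_1)\mathbf{c}_1^T = \sigma_1\mathbf{d}_1\mathbf{c}_1^T$ and compares $\|A - A_1\|_2$ with $\sigma_2$.

```python
import math

def gram(a):
    """A^T A for a real 2x2 matrix A."""
    return [[sum(a[k][i] * a[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def singular_data(a):
    """Pairs (sigma_i, c_i) of a real 2x2 matrix; requires (A^T A)[0][1] != 0."""
    g = gram(a)
    p, b, d = g[0][0], g[0][1], g[1][1]
    mean, radius = (p + d) / 2.0, math.hypot((p - d) / 2.0, b)
    out = []
    for lam in (mean + radius, mean - radius):
        lam = max(lam, 0.0)              # guard against tiny negative round-off
        v = [b, lam - p]                 # eigenvector of A^T A for lam
        n = math.hypot(*v)
        out.append((math.sqrt(lam), [v[0] / n, v[1] / n]))
    return out

def best_rank_one(a):
    """A_1 = sigma_1 d_1 c_1^T, computed as (A c_1) c_1^T."""
    (_, c1), _ = singular_data(a)
    ac = [a[i][0] * c1[0] + a[i][1] * c1[1] for i in range(2)]
    return [[ac[i] * c1[j] for j in range(2)] for i in range(2)]

A = [[4.0, 0.0], [3.0, -5.0]]            # hypothetical example
(s1, _), (s2, _) = singular_data(A)
A1 = best_rank_one(A)
R = [[A[i][j] - A1[i][j] for j in range(2)] for i in range(2)]
frob = math.sqrt(sum(x * x for row in A for x in row))
print(frob, math.hypot(s1, s2))          # both sides of (4.9.8)
print(singular_data(R)[0][0], s2)        # both sides of (4.9.10) for k = 1
```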

We now restate the above results for linear operators. 

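Definition 4.9.7 Let $\mathbf{U}, \mathbf{V}$ be finite dimensional vector spaces over $\mathbb{F} = \mathbb{R}, \mathbb{C}$. For $k \in \mathbb{Z}_+$ denote $L_k(\mathbf{V}, \mathbf{U}) := \{T \in L(\mathbf{V}, \mathbf{U}) : \mathrm{rank}\, T \le k\}$. Assume furthermore that $\mathbf{U}, \mathbf{V}$ are IPS. Let $T \in L(\mathbf{V}, \mathbf{U})$ and assume that the orthonormal bases $[\mathbf{d}_1, \ldots, \mathbf{d}_m]$, $[\mathbf{c}_1, \ldots, \mathbf{c}_n]$ of $\mathbf{U}, \mathbf{V}$ respectively satisfy (4.9.5). Define $T_0 := 0$ and $T_k := T$ for an integer $k \ge \mathrm{rank}\, T$. Let $k \in [1, \mathrm{rank}\, T - 1] \cap \mathbb{N}$. Define $T_k \in L(\mathbf{V}, \mathbf{U})$ by the equality $T_k(\mathbf{v}) = \sum_{i=1}^k \sigma_i(T)\langle \mathbf{v}, \mathbf{c}_i\rangle \mathbf{d}_i$ for any $\mathbf{v} \in \mathbf{V}$.

It is straightforward to show that $T_k \in L_k(\mathbf{V}, \mathbf{U})$ and $T_k$ is unique if and only if $\sigma_k(T) > \sigma_{k+1}(T)$. See Problem 8. Theorem 4.9.6 yields:

Corollary 4.9.8 Let $\mathbf{U}$ and $\mathbf{V}$ be finite dimensional IPS over $\mathbb{F} = \mathbb{R}, \mathbb{C}$. Let $T : \mathbf{V} \to \mathbf{U}$ be a linear operator. Then

(4.9.13) $\|T\|_F := \sqrt{\mathrm{tr}\, T^*T} = \sqrt{\mathrm{tr}\, TT^*} = \sqrt{\sum_{i=1}^{\mathrm{rank}\, T} \sigma_i(T)^2}.$

(4.9.14) $\|T\|_2 := \max_{\mathbf{x} \in \mathbf{V}, \|\mathbf{x}\|_2 = 1} \|T\mathbf{x}\|_2 = \sigma_1(T).$

(4.9.15) $\min_{Q \in L_k(\mathbf{V}, \mathbf{U})} \|T - Q\|_2 = \sigma_{k+1}(T), \quad k = 1, \ldots, \mathrm{rank}\, T - 1.$
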

Problems 

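1. Let $\mathbf{U}, \mathbf{V}$ be finite dimensional inner product spaces. Assume that $T \in L(\mathbf{U}, \mathbf{V})$. Show that for any complex number $t \in \mathbb{C}$, $\sigma_i(tT) = |t|\sigma_i(T)$ for all $i$.

2. Prove Proposition 4.9.4. (Use SVD to prove the nontrivial part of the Proposition.)

3. Let $A \in \mathbb{C}^{m \times n}$ and assume that $U \in \mathrm{U}(m)$, $V \in \mathrm{U}(n)$. Show that $\sigma_i(UAV) = \sigma_i(A)$ for all $i$.

4. Let $A \in GL(n, \mathbb{C})$. Show that $\sigma_1(A^{-1}) = \sigma_n(A)^{-1}$.

5. Let $\mathbf{U}, \mathbf{V}$ be inner product spaces of dimensions $m$ and $n$ respectively. Assume that

$$\mathbf{U} = \mathbf{U}_1 \oplus \mathbf{U}_2,\ \dim \mathbf{U}_1 = m_1,\ \dim \mathbf{U}_2 = m_2, \qquad \mathbf{V} = \mathbf{V}_1 \oplus \mathbf{V}_2,\ \dim \mathbf{V}_1 = n_1,\ \dim \mathbf{V}_2 = n_2.$$

Assume that $T \in L(\mathbf{V}, \mathbf{U})$. Suppose furthermore that $T\mathbf{V}_1 \subseteq \mathbf{U}_1$, $T\mathbf{V}_2 \subseteq \mathbf{U}_2$. Let $T_i \in L(\mathbf{V}_i, \mathbf{U}_i)$ be the restriction of $T$ to $\mathbf{V}_i$ for $i = 1, 2$. Then $\mathrm{rank}\, T = \mathrm{rank}\, T_1 + \mathrm{rank}\, T_2$ and $\{\sigma_1(T), \ldots, \sigma_{\mathrm{rank}\, T}(T)\} = \{\sigma_1(T_1), \ldots, \sigma_{\mathrm{rank}\, T_1}(T_1)\} \cup \{\sigma_1(T_2), \ldots, \sigma_{\mathrm{rank}\, T_2}(T_2)\}$.

6. Let the assumptions of Definition 4.9.5 hold. Show that for $1 \le k < \mathrm{rank}\, A$, $A_k$ is uniquely defined if and only if $\sigma_k > \sigma_{k+1}$.

7. Prove the equalities in (4.9.8).

8. Let the assumptions of Definition 4.9.7 hold. Show that for $k \in [1, \mathrm{rank}\, T - 1] \cap \mathbb{N}$, $\mathrm{rank}\, T_k = k$ and $T_k$ is unique if and only if $\sigma_k(T) > \sigma_{k+1}(T)$.

9. Let $\mathbf{V}$ be an $n$-dimensional IPS. Assume that $T \in L(\mathbf{V})$ is a normal operator. Let $\lambda_1(T), \ldots, \lambda_n(T)$ be the eigenvalues of $T$ arranged in the order $|\lambda_1(T)| \ge \ldots \ge |\lambda_n(T)|$. Show that $\sigma_i(T) = |\lambda_i(T)|$ for $i = 1, \ldots, n$.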



4.10 Characterizations of singular values 

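Theorem 4.10.1 Let $\mathbb{F} = \mathbb{R}, \mathbb{C}$ and assume that $A \in \mathbb{F}^{m \times n}$. Define

(4.10.1) $H(A) := \begin{bmatrix} 0 & A \\ A^* & 0 \end{bmatrix} \in \mathrm{H}_{m+n}(\mathbb{F}).$

Then

(4.10.2) $\lambda_i(H(A)) = \sigma_i(A), \quad \lambda_{m+n+1-i}(H(A)) = -\sigma_i(A), \quad i = 1, \ldots, \mathrm{rank}\, A,$
$\quad \lambda_j(H(A)) = 0, \quad j = \mathrm{rank}\, A + 1, \ldots, n + m - \mathrm{rank}\, A.$

View $A$ as an operator $A : \mathbb{F}^n \to \mathbb{F}^m$. Choose orthonormal bases $[\mathbf{d}_1, \ldots, \mathbf{d}_m]$, $[\mathbf{c}_1, \ldots, \mathbf{c}_n]$ in $\mathbb{F}^m, \mathbb{F}^n$ respectively satisfying (4.9.5). Then

(4.10.3) $\begin{bmatrix} 0 & A \\ A^* & 0 \end{bmatrix} \begin{bmatrix} \mathbf{d}_i \\ \mathbf{c}_i \end{bmatrix} = \sigma_i(A) \begin{bmatrix} \mathbf{d}_i \\ \mathbf{c}_i \end{bmatrix}, \quad \begin{bmatrix} 0 & A \\ A^* & 0 \end{bmatrix} \begin{bmatrix} \mathbf{d}_i \\ -\mathbf{c}_i \end{bmatrix} = -\sigma_i(A) \begin{bmatrix} \mathbf{d}_i \\ -\mathbf{c}_i \end{bmatrix}, \quad i = 1, \ldots, \mathrm{rank}\, A,$
$\quad \ker H(A) = \mathrm{span}\,\big((\mathbf{d}_{r+1}^*, \mathbf{0})^*, \ldots, (\mathbf{d}_m^*, \mathbf{0})^*, (\mathbf{0}, \mathbf{c}_{r+1}^*)^*, \ldots, (\mathbf{0}, \mathbf{c}_n^*)^*\big), \quad r = \mathrm{rank}\, A.$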

Proof. It is straightforward to show the equalities (4.10.3). Since all 
the eigenvectors appearing in (4.10.3) are linearly independent we deduce 
(4.10.2). □ 
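
The eigenvector relations (4.10.3) are easy to test numerically. The following sketch, under the assumption of a real invertible 2x2 matrix `A` (a hypothetical example) with nonzero off-diagonal Gram entry, builds $H(A)$ as a 4x4 array and measures the residuals of $H(A)\mathbf{z} = \pm\sigma_i\mathbf{z}$ for $\mathbf{z} = (\mathbf{d}_i, \pm\mathbf{c}_i)$.

```python
import math

def right_singular_pairs(a):
    """Pairs (sigma, c) of a real invertible 2x2 matrix A, assuming (A^T A)[0][1] != 0."""
    p = a[0][0] ** 2 + a[1][0] ** 2
    b = a[0][0] * a[0][1] + a[1][0] * a[1][1]
    d = a[0][1] ** 2 + a[1][1] ** 2
    mean, radius = (p + d) / 2.0, math.hypot((p - d) / 2.0, b)
    pairs = []
    for lam in (mean + radius, mean - radius):
        v = (b, lam - p)                 # eigenvector of A^T A for lam
        n = math.hypot(*v)
        pairs.append((math.sqrt(lam), [v[0] / n, v[1] / n]))
    return pairs

def h_matrix(a):
    """H(A) = [[0, A], [A^T, 0]] for a real 2x2 matrix A, as a 4x4 nested list."""
    top = [[0.0, 0.0] + list(row) for row in a]
    bottom = [[a[0][j], a[1][j], 0.0, 0.0] for j in range(2)]
    return top + bottom

def residual(h, z, mu):
    """Largest entry of |H z - mu z|."""
    hz = [sum(h[i][j] * z[j] for j in range(4)) for i in range(4)]
    return max(abs(hz[i] - mu * z[i]) for i in range(4))

A = [[4.0, 0.0], [3.0, -5.0]]            # hypothetical example
H = h_matrix(A)
checks = []
for sigma, c in right_singular_pairs(A):
    d = [(A[i][0] * c[0] + A[i][1] * c[1]) / sigma for i in range(2)]  # d = A c / sigma
    checks.append(residual(H, d + c, sigma))                 # eigenvalue +sigma
    checks.append(residual(H, d + [-x for x in c], -sigma))  # eigenvalue -sigma
print(checks)
```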



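Corollary 4.10.2 Let $\mathbb{F} = \mathbb{R}, \mathbb{C}$ and assume that $A \in \mathbb{F}^{m \times n}$. Let $\hat{A} := A[\alpha, \beta] \in \mathbb{F}^{p \times q}$ be a submatrix of $A$, formed by the set of rows and columns $\alpha \in Q_{p,m}$, $\beta \in Q_{q,n}$ respectively. Then

(4.10.4) $\sigma_i(\hat{A}) \le \sigma_i(A)$ for $i = 1, \ldots$.

For $l \in [1, \mathrm{rank}\, \hat{A}] \cap \mathbb{N}$ the equalities $\sigma_i(\hat{A}) = \sigma_i(A)$, $i = 1, \ldots, l$ hold if and only if there exist two orthonormal systems of $l$ right and left singular vectors $\mathbf{c}_1, \ldots, \mathbf{c}_l \in \mathbb{F}^n$, $\mathbf{d}_1, \ldots, \mathbf{d}_l \in \mathbb{F}^m$ satisfying (4.10.3) for $i = 1, \ldots, l$ such that the nonzero coordinates of the vectors $\mathbf{c}_1, \ldots, \mathbf{c}_l$ and $\mathbf{d}_1, \ldots, \mathbf{d}_l$ are located at the indices $\beta, \alpha$ respectively.

See Problem 1.

Corollary 4.10.3 Let $\mathbf{V}, \mathbf{U}$ be IPS over $\mathbb{F} = \mathbb{R}, \mathbb{C}$. Assume that $\mathbf{W}$ is a subspace of $\mathbf{V}$. Let $T \in L(\mathbf{V}, \mathbf{U})$ and denote by $\hat{T} \in L(\mathbf{W}, \mathbf{U})$ the restriction of $T$ to $\mathbf{W}$. Then $\sigma_i(\hat{T}) \le \sigma_i(T)$ for any $i \in \mathbb{N}$. Furthermore $\sigma_i(\hat{T}) = \sigma_i(T)$ for $i = 1, \ldots, l \le \mathrm{rank}\, \hat{T}$ if and only if $\mathbf{W}$ contains a subspace spanned by the first $l$ right singular vectors of $T$.

See Problem 2.

Define $\mathbb{R}^n_{+,\searrow} := \mathbb{R}^n_{\searrow} \cap \mathbb{R}^n_+$. Then $D \subset \mathbb{R}^n_{+,\searrow}$ is called a strong Schur set if for any $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n_{+,\searrow}$, $\mathbf{x} \preceq \mathbf{y}$ we have the implication $\mathbf{y} \in D \Rightarrow \mathbf{x} \in D$.

Theorem 4.10.4 Let $p \in \mathbb{N}$ and $D \subset \mathbb{R}^p_{\searrow} \cap \mathbb{R}^p_+$ be a regular convex strong Schur domain. Fix $m, n \in \mathbb{N}$ and let $\sigma_{(p)}(D) := \{A \in \mathbb{F}^{m \times n} : \sigma_{(p)}(A) \in D\}$. Let $h : D \to \mathbb{R}$ be convex and strongly Schur's order preserving on $D$. Let $f : \sigma_{(p)}(D) \to \mathbb{R}$ be given as $h \circ \sigma_{(p)}$. Then $f$ is a convex function.

See Problem 3.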
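Corollary 4.10.5 Let $\mathbb{F} = \mathbb{R}, \mathbb{C}$, $m, n, p \in \mathbb{N}$, $q \in [1, \infty)$ and $w_1 \ge w_2 \ge \ldots \ge w_p > 0$. Then the following function

$$f : \mathbb{F}^{m \times n} \to \mathbb{R}, \quad f(A) := \Big(\sum_{i=1}^p w_i \sigma_i(A)^q\Big)^{\frac{1}{q}}, \quad A \in \mathbb{F}^{m \times n}$$

is a convex function.

See Problem 4.

We now translate Theorem 4.10.1 to the operator setting.

Lemma 4.10.6 Let $\mathbf{U}, \mathbf{V}$ be finite dimensional IPS with the inner products $\langle\cdot,\cdot\rangle_{\mathbf{U}}, \langle\cdot,\cdot\rangle_{\mathbf{V}}$ respectively. Define $\mathbf{W} := \mathbf{V} \oplus \mathbf{U}$ to be the induced IPS with

$$\langle(\mathbf{y}, \mathbf{x}), (\mathbf{v}, \mathbf{u})\rangle_{\mathbf{W}} := \langle\mathbf{y}, \mathbf{v}\rangle_{\mathbf{V}} + \langle\mathbf{x}, \mathbf{u}\rangle_{\mathbf{U}}.$$

Let $T : \mathbf{V} \to \mathbf{U}$ be a linear operator, and $T^* : \mathbf{U} \to \mathbf{V}$ be the adjoint of $T$. Define the operator

(4.10.5) $\hat{T} : \mathbf{W} \to \mathbf{W}, \quad \hat{T}(\mathbf{y}, \mathbf{x}) := (T^*\mathbf{x}, T\mathbf{y}).$

Then $\hat{T}$ is a self-adjoint operator and $\hat{T}^2 = T^*T \oplus TT^*$. Hence the spectrum of $\hat{T}$ is symmetric with respect to the origin and $\hat{T}$ has exactly $2\,\mathrm{rank}\, T$ nonzero eigenvalues. More precisely, if $\dim \mathbf{U} = m$, $\dim \mathbf{V} = n$ then:

(4.10.6) $\lambda_i(\hat{T}) = -\lambda_{m+n-i+1}(\hat{T}) = \sigma_i(T)$ for $i = 1, \ldots, \mathrm{rank}\, T$, $\quad \lambda_j(\hat{T}) = 0$ for $j = \mathrm{rank}\, T + 1, \ldots, n + m - \mathrm{rank}\, T$.

Let $\{\mathbf{d}_1, \ldots, \mathbf{d}_{\min(m,n)}\} \in \mathrm{Fr}(\min(m, n), \mathbf{U})$, $\{\mathbf{c}_1, \ldots, \mathbf{c}_{\min(m,n)}\} \in \mathrm{Fr}(\min(m, n), \mathbf{V})$ be the set of vectors satisfying (4.9.5). Define

(4.10.7) $\mathbf{z}_i := \frac{1}{\sqrt{2}}(\mathbf{c}_i, \mathbf{d}_i), \quad \mathbf{z}_{m+n-i+1} := \frac{1}{\sqrt{2}}(\mathbf{c}_i, -\mathbf{d}_i), \quad i = 1, \ldots, \min(m, n).$

Then $\{\mathbf{z}_1, \mathbf{z}_{m+n}, \ldots, \mathbf{z}_{\min(m,n)}, \mathbf{z}_{m+n-\min(m,n)+1}\} \in \mathrm{Fr}(2\min(m, n), \mathbf{W})$. Furthermore $\hat{T}\mathbf{z}_i = \sigma_i(T)\mathbf{z}_i$, $\hat{T}\mathbf{z}_{m+n-i+1} = -\sigma_i(T)\mathbf{z}_{m+n-i+1}$ for $i = 1, \ldots, \min(m, n)$.

See Problem 5.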

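Theorem 4.10.7 Let $\mathbf{U}, \mathbf{V}$ be $m$ and $n$-dimensional IPS over $\mathbb{C}$ respectively. Let $T : \mathbf{V} \to \mathbf{U}$ be a linear operator. Then for each $k \in [1, \min(m, n)] \cap \mathbb{Z}$

(4.10.8) $\sum_{i=1}^k \sigma_i(T) = \max_{\{\mathbf{f}_1, \ldots, \mathbf{f}_k\} \in \mathrm{Fr}(k, \mathbf{U}),\ \{\mathbf{g}_1, \ldots, \mathbf{g}_k\} \in \mathrm{Fr}(k, \mathbf{V})} \sum_{i=1}^k \Re\langle T\mathbf{g}_i, \mathbf{f}_i\rangle_{\mathbf{U}} = \max_{\{\mathbf{f}_1, \ldots, \mathbf{f}_k\} \in \mathrm{Fr}(k, \mathbf{U}),\ \{\mathbf{g}_1, \ldots, \mathbf{g}_k\} \in \mathrm{Fr}(k, \mathbf{V})} \sum_{i=1}^k |\langle T\mathbf{g}_i, \mathbf{f}_i\rangle_{\mathbf{U}}|.$

Furthermore $\sum_{i=1}^k \sigma_i(T) = \sum_{i=1}^k \Re\langle T\mathbf{g}_i, \mathbf{f}_i\rangle_{\mathbf{U}}$ for some two $k$-orthonormal frames $F_k = \{\mathbf{f}_1, \ldots, \mathbf{f}_k\}$, $G_k = \{\mathbf{g}_1, \ldots, \mathbf{g}_k\}$ if and only if $\mathrm{span}\,((\mathbf{g}_1, \mathbf{f}_1), \ldots, (\mathbf{g}_k, \mathbf{f}_k))$ is spanned by $k$ eigenvectors of $\hat{T}$ corresponding to the first $k$ eigenvalues of $\hat{T}$.

Proof. Assume that $\{\mathbf{f}_1, \ldots, \mathbf{f}_k\} \in \mathrm{Fr}(k, \mathbf{U})$, $\{\mathbf{g}_1, \ldots, \mathbf{g}_k\} \in \mathrm{Fr}(k, \mathbf{V})$. Let $\mathbf{w}_i := \frac{1}{\sqrt{2}}(\mathbf{g}_i, \mathbf{f}_i)$, $i = 1, \ldots, k$. Then $\{\mathbf{w}_1, \ldots, \mathbf{w}_k\} \in \mathrm{Fr}(k, \mathbf{W})$. A straightforward calculation shows $\sum_{i=1}^k \langle\hat{T}\mathbf{w}_i, \mathbf{w}_i\rangle_{\mathbf{W}} = \sum_{i=1}^k \Re\langle T\mathbf{g}_i, \mathbf{f}_i\rangle_{\mathbf{U}}$. The maximal characterization of $\sum_{i=1}^k \lambda_i(\hat{T})$ (Theorem 4.4.8) and (4.10.6) yield the inequality $\sum_{i=1}^k \sigma_i(T) \ge \sum_{i=1}^k \Re\langle T\mathbf{g}_i, \mathbf{f}_i\rangle_{\mathbf{U}}$ for $k \in [1, \min(m, n)] \cap \mathbb{Z}$.

Let $\mathbf{c}_1, \ldots, \mathbf{c}_{\min(m,n)}$, $\mathbf{d}_1, \ldots, \mathbf{d}_{\min(m,n)}$ satisfy (4.9.5). Then Lemma 4.10.6 yields that $\sum_{i=1}^k \sigma_i(T) = \sum_{i=1}^k \Re\langle T\mathbf{c}_i, \mathbf{d}_i\rangle_{\mathbf{U}}$ for $k \in [1, \min(m, n)] \cap \mathbb{Z}$. This proves the first equality of (4.10.8). The second equality of (4.10.8) is straightforward. (See Problem 6.)

Assume now that $\sum_{i=1}^k \sigma_i(T) = \sum_{i=1}^k \Re\langle T\mathbf{g}_i, \mathbf{f}_i\rangle_{\mathbf{U}}$ for some two $k$-orthonormal frames $F_k = \{\mathbf{f}_1, \ldots, \mathbf{f}_k\}$, $G_k = \{\mathbf{g}_1, \ldots, \mathbf{g}_k\}$. Define $\mathbf{w}_1, \ldots, \mathbf{w}_k$ as above. The above arguments yield that $\sum_{i=1}^k \langle\hat{T}\mathbf{w}_i, \mathbf{w}_i\rangle_{\mathbf{W}} = \sum_{i=1}^k \lambda_i(\hat{T})$. Theorem 4.4.8 yields that $\mathrm{span}\,((\mathbf{g}_1, \mathbf{f}_1), \ldots, (\mathbf{g}_k, \mathbf{f}_k))$ is spanned by $k$ eigenvectors of $\hat{T}$ corresponding to the first $k$ eigenvalues of $\hat{T}$. Vice versa, assume that $\{\mathbf{f}_1, \ldots, \mathbf{f}_k\} \in \mathrm{Fr}(k, \mathbf{U})$, $\{\mathbf{g}_1, \ldots, \mathbf{g}_k\} \in \mathrm{Fr}(k, \mathbf{V})$ and $\mathrm{span}\,((\mathbf{g}_1, \mathbf{f}_1), \ldots, (\mathbf{g}_k, \mathbf{f}_k))$ is spanned by $k$ eigenvectors of $\hat{T}$ corresponding to the first $k$ eigenvalues of $\hat{T}$. Define $\{\mathbf{w}_1, \ldots, \mathbf{w}_k\} \in \mathrm{Fr}(k, \mathbf{W})$ as above. Then $\mathrm{span}\,(\mathbf{w}_1, \ldots, \mathbf{w}_k)$ contains $k$ linearly independent eigenvectors corresponding to the first $k$ eigenvalues of $\hat{T}$. Theorem 4.4.8 and Lemma 4.10.6 yield that $\sum_{i=1}^k \sigma_i(T) = \sum_{i=1}^k \langle\hat{T}\mathbf{w}_i, \mathbf{w}_i\rangle_{\mathbf{W}} = \sum_{i=1}^k \Re\langle T\mathbf{g}_i, \mathbf{f}_i\rangle_{\mathbf{U}}$. □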

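Theorem 4.10.8 Let $\mathbf{U}, \mathbf{V}$ be $m$ and $n$ dimensional IPS respectively. Let $S, T : \mathbf{V} \to \mathbf{U}$ be linear operators. Then

(4.10.9) $\Re\,\mathrm{tr}\,(S^*T) \le \sum_{i=1}^{\min(m,n)} \sigma_i(S)\sigma_i(T).$

Equality holds if and only if there exist two orthonormal sets $\{\mathbf{d}_1, \ldots, \mathbf{d}_{\min(m,n)}\} \in \mathrm{Fr}(\min(m, n), \mathbf{U})$, $\{\mathbf{c}_1, \ldots, \mathbf{c}_{\min(m,n)}\} \in \mathrm{Fr}(\min(m, n), \mathbf{V})$ such that

(4.10.10) $S\mathbf{c}_i = \sigma_i(S)\mathbf{d}_i, \quad T\mathbf{c}_i = \sigma_i(T)\mathbf{d}_i, \quad S^*\mathbf{d}_i = \sigma_i(S)\mathbf{c}_i, \quad T^*\mathbf{d}_i = \sigma_i(T)\mathbf{c}_i, \quad i = 1, \ldots, \min(m, n).$

Proof. Let $A, B \in \mathbb{C}^{n \times m}$. Then $\mathrm{tr}\, B^*A = \mathrm{tr}\, AB^*$ and $\mathrm{tr}\, A^*B = \overline{\mathrm{tr}\, AB^*}$. Hence $2\Re\,\mathrm{tr}\, AB^* = \mathrm{tr}\, H(A)H(B)$. Therefore $2\Re\,\mathrm{tr}\, S^*T = \mathrm{tr}\, \hat{S}\hat{T}$. Use Theorem 4.8.1 for $\hat{S}, \hat{T}$ and Lemma 4.10.6 to deduce (4.10.9). Equality in (4.10.9) holds if and only if $\mathrm{tr}\, \hat{S}\hat{T} = \sum_{i=1}^{m+n} \lambda_i(\hat{S})\lambda_i(\hat{T})$.

Clearly, the assumptions that $\{\mathbf{d}_1, \ldots, \mathbf{d}_{\min(m,n)}\} \in \mathrm{Fr}(\min(m, n), \mathbf{U})$, $\{\mathbf{c}_1, \ldots, \mathbf{c}_{\min(m,n)}\} \in \mathrm{Fr}(\min(m, n), \mathbf{V})$, and the equalities (4.10.10) imply equality in (4.10.9).

Assume equality in (4.10.9). Theorem 4.8.1 and the definitions of $\hat{S}, \hat{T}$ yield the existence of $\{\mathbf{d}_1, \ldots, \mathbf{d}_{\min(m,n)}\} \in \mathrm{Fr}(\min(m, n), \mathbf{U})$, $\{\mathbf{c}_1, \ldots, \mathbf{c}_{\min(m,n)}\} \in \mathrm{Fr}(\min(m, n), \mathbf{V})$ such that (4.10.10) holds. □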
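
Inequality (4.10.9) can be illustrated for real 2x2 matrices, where the singular values follow from $\sigma_1^2 + \sigma_2^2 = \mathrm{tr}\, A^TA$ and $\sigma_1\sigma_2 = |\det A|$. The sketch below is an assumption-laden example: the matrices `S`, `T` and the helper names are hypothetical, and the real case replaces $S^*$ by $S^T$.

```python
import math

def singular_values(a):
    """Singular values (decreasing) of a real 2x2 matrix via tr(A^T A) and det(A)."""
    f = sum(x * x for row in a for x in row)          # sigma_1^2 + sigma_2^2
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]       # |det| = sigma_1 * sigma_2
    disc = math.sqrt(max(f * f - 4.0 * det * det, 0.0))
    return [math.sqrt((f + disc) / 2.0), math.sqrt(max((f - disc) / 2.0, 0.0))]

def trace_inner(s, t):
    """tr(S^T T), the sum of entrywise products, for real 2x2 matrices."""
    return sum(s[i][j] * t[i][j] for i in range(2) for j in range(2))

def singular_value_bound(s, t):
    """Right-hand side of (4.10.9)."""
    return sum(x * y for x, y in zip(singular_values(s), singular_values(t)))

# Hypothetical example matrices.
S = [[1.0, 2.0], [3.0, 4.0]]
T = [[0.0, 1.0], [-1.0, 2.0]]
print(trace_inner(S, T), singular_value_bound(S, T))
print(trace_inner(S, S), singular_value_bound(S, S))   # equality when T = S
```

Taking $T = S$ gives equality, since then both sides equal $\sigma_1(S)^2 + \sigma_2(S)^2$ and (4.10.10) holds with the singular vectors of $S$.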



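Theorem 4.10.9 Let $\mathbf{U}$ and $\mathbf{V}$ be finite dimensional IPS over $\mathbb{F} = \mathbb{R}, \mathbb{C}$. Let $T : \mathbf{V} \to \mathbf{U}$ be a linear operator. Then

(4.10.11) $\min_{Q \in L_k(\mathbf{V}, \mathbf{U})} \|T - Q\|_F = \sqrt{\sum_{i=k+1}^{\mathrm{rank}\, T} \sigma_i^2(T)}, \quad k = 1, \ldots, \mathrm{rank}\, T - 1.$

Furthermore $\|T - Q\|_F = \sqrt{\sum_{i=k+1}^{\mathrm{rank}\, T} \sigma_i^2(T)}$ for some $Q \in L_k(\mathbf{V}, \mathbf{U})$, $k < \mathrm{rank}\, T$, if and only if $Q = T_k$, where $T_k$ is defined in Definition 4.9.7.

Proof. Use Theorem 4.10.8 to deduce that for any $Q \in L_k(\mathbf{V}, \mathbf{U})$ one has

$$\|T - Q\|_F^2 = \mathrm{tr}\, T^*T - 2\Re\,\mathrm{tr}\, Q^*T + \mathrm{tr}\, Q^*Q \ge \sum_{i=1}^{\mathrm{rank}\, T} \sigma_i^2(T) - 2\sum_{i=1}^k \sigma_i(T)\sigma_i(Q) + \sum_{i=1}^k \sigma_i^2(Q)$$

$$= \sum_{i=1}^k (\sigma_i(T) - \sigma_i(Q))^2 + \sum_{i=k+1}^{\mathrm{rank}\, T} \sigma_i^2(T) \ge \sum_{i=k+1}^{\mathrm{rank}\, T} \sigma_i^2(T).$$

Clearly $\|T - T_k\|_F^2 = \sum_{i=k+1}^{\mathrm{rank}\, T} \sigma_i^2(T)$. Hence (4.10.11) holds. Vice versa, if $Q \in L_k(\mathbf{V}, \mathbf{U})$ and $\|T - Q\|_F^2 = \sum_{i=k+1}^{\mathrm{rank}\, T} \sigma_i^2(T)$ then the equality case in Theorem 4.10.8 yields that $Q = T_k$. □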
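
To see (4.10.11) at work for $k = 1$, one can search over rank-one candidates numerically. The sketch below is an illustration only: it uses a hypothetical real 2x2 matrix `A` whose Gram matrix has eigenvalues 40 and 10, parametrizes rank-one matrices as $s\,\mathbf{u}\mathbf{v}^T$ with unit vectors at grid angles, and uses the optimal scalar $s = \mathbf{u}^TA\mathbf{v}$ for each pair.

```python
import math

def frob(m):
    """Frobenius norm of a matrix given as nested lists."""
    return math.sqrt(sum(x * x for row in m for x in row))

def rank_one_error(a, theta, phi):
    """||A - s u v^T||_F for unit u, v at angles theta, phi and optimal s = u^T A v."""
    u = (math.cos(theta), math.sin(theta))
    v = (math.cos(phi), math.sin(phi))
    s = sum(u[i] * a[i][j] * v[j] for i in range(2) for j in range(2))
    return frob([[a[i][j] - s * u[i] * v[j] for j in range(2)] for i in range(2)])

A = [[4.0, 0.0], [3.0, -5.0]]    # hypothetical example; A^T A has eigenvalues 40 and 10
sigma2 = math.sqrt(10.0)         # second singular value of this particular A
steps = 180                      # one-degree grid on [0, pi) for both angles
best = min(rank_one_error(A, math.pi * i / steps, math.pi * j / steps)
           for i in range(steps) for j in range(steps))
print(best, sigma2)
```

The grid minimum stays at or above $\sigma_2(A)$ and approaches it as the grid is refined, which is what (4.10.11) predicts for $k = 1$.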



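Corollary 4.10.10 Let $\mathbb{F} = \mathbb{R}, \mathbb{C}$ and $A \in \mathbb{F}^{m \times n}$. Then

(4.10.12) $\min_{B \in \mathcal{R}_{m,n,k}(\mathbb{F})} \|A - B\|_F = \sqrt{\sum_{i=k+1}^{\mathrm{rank}\, A} \sigma_i^2(A)}, \quad k = 1, \ldots, \mathrm{rank}\, A - 1.$

Furthermore $\|A - B\|_F = \sqrt{\sum_{i=k+1}^{\mathrm{rank}\, A} \sigma_i^2(A)}$ for some $B \in \mathcal{R}_{m,n,k}(\mathbb{F})$, $k < \mathrm{rank}\, A$, if and only if $B = A_k$, where $A_k$ is defined in Definition 4.9.5.

Theorem 4.10.11 Let $\mathbb{F} = \mathbb{R}, \mathbb{C}$ and $A \in \mathbb{F}^{m \times n}$. Then

(4.10.13) $\min_{B \in \mathcal{R}_{m,n,k}(\mathbb{F})} \sum_{i=1}^j \sigma_i(A - B) = \sum_{i=k+1}^{k+j} \sigma_i(A), \quad j = 1, \ldots, \min(m, n) - k, \quad k = 1, \ldots, \min(m, n) - 1.$

Proof. Clearly, for $B = A_k$ we have the equality $\sum_{i=1}^j \sigma_i(A - B) = \sum_{i=k+1}^{k+j} \sigma_i(A)$. Let $B \in \mathcal{R}_{m,n,k}(\mathbb{F})$. Let $\mathbf{X} \in \mathrm{Gr}(k, \mathbb{C}^n)$ be a subspace which contains the columns of $B$. Let $\mathbf{W} = \{(\mathbf{0}^T, \mathbf{x}^T)^T \in \mathbb{F}^{m+n}, \mathbf{x} \in \mathbf{X}\}$. Observe that for any $\mathbf{z} \in \mathbf{W}^{\perp}$ one has the equality $\mathbf{z}^*H(A - B)\mathbf{z} = \mathbf{z}^*H(A)\mathbf{z}$. Combine Theorems 4.4.9 and 4.10.1 to deduce $\sum_{i=1}^j \sigma_i(B - A) \ge \sum_{i=k+1}^{k+j} \sigma_i(A)$. □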


Theorem 4.10.12 Let $\mathbf{V}$ be an $n$-dimensional IPS over $\mathbb{C}$. Let $T : \mathbf{V} \to \mathbf{V}$ be a linear operator. Assume the $n$ eigenvalues of $T$, $\lambda_1(T), \ldots, \lambda_n(T)$, are arranged in the order $|\lambda_1(T)| \ge \ldots \ge |\lambda_n(T)|$. Let $\lambda_a(T) := (|\lambda_1(T)|, \ldots, |\lambda_n(T)|)$, $\sigma(T) := (\sigma_1(T), \ldots, \sigma_n(T))$. Then $\lambda_a(T) \preceq \sigma(T)$. That is

(4.10.14) $\sum_{i=1}^k |\lambda_i(T)| \le \sum_{i=1}^k \sigma_i(T), \quad k = 1, \ldots, n.$

Furthermore, $\sum_{i=1}^k |\lambda_i(T)| = \sum_{i=1}^k \sigma_i(T)$ for some $k \in [1, n] \cap \mathbb{Z}$ if and only if the following conditions are satisfied. There exists an orthonormal basis $\mathbf{x}_1, \ldots, \mathbf{x}_n$ of $\mathbf{V}$ such that:

1. $T\mathbf{x}_i = \lambda_i(T)\mathbf{x}_i$, $T^*\mathbf{x}_i = \bar{\lambda}_i(T)\mathbf{x}_i$ for $i = 1, \ldots, k$.

2. Denote by $S : \mathbf{U} \to \mathbf{U}$ the restriction of $T$ to the invariant subspace $\mathbf{U} = \mathrm{span}\,(\mathbf{x}_{k+1}, \ldots, \mathbf{x}_n)$. Then $\|S\|_2 \le |\lambda_k(T)|$.

Proof. Use Theorem 4.2.12 to choose an orthonormal basis $\mathbf{g}_1, \ldots, \mathbf{g}_n$ of $\mathbf{V}$ such that $T$ is represented by an upper triangular matrix $A = [a_{ij}] \in \mathbb{C}^{n \times n}$ such that $a_{ii} = \lambda_i(T)$, $i = 1, \ldots, n$. Let $\epsilon_i \in \mathbb{C}$, $|\epsilon_i| = 1$ be such that $\bar{\epsilon}_i\lambda_i(T) = |\lambda_i(T)|$ for $i = 1, \ldots, n$. Let $S \in L(\mathbf{V})$ be represented in the basis $\mathbf{g}_1, \ldots, \mathbf{g}_n$ by the diagonal matrix $\mathrm{diag}(\epsilon_1, \ldots, \epsilon_k, 0, \ldots, 0)$. Clearly, $\sigma_i(S) = 1$ for $i = 1, \ldots, k$ and $\sigma_i(S) = 0$ for $i = k+1, \ldots, n$. Furthermore, $\Re\,\mathrm{tr}\, S^*T = \sum_{i=1}^k |\lambda_i(T)|$. Hence Theorem 4.10.8 yields (4.10.14).

Assume now that $\sum_{i=1}^k |\lambda_i(T)| = \sum_{i=1}^k \sigma_i(T)$. Hence the equality sign holds in (4.10.9). Hence there exist two orthonormal bases $\{\mathbf{c}_1, \ldots, \mathbf{c}_n\}$, $\{\mathbf{d}_1, \ldots, \mathbf{d}_n\}$ in $\mathbf{V}$ such that (4.10.10) holds. It easily follows that $\{\mathbf{c}_1, \ldots, \mathbf{c}_k\}$, $\{\mathbf{d}_1, \ldots, \mathbf{d}_k\}$ are orthonormal bases of $\mathbf{W} := \mathrm{span}\,(\mathbf{g}_1, \ldots, \mathbf{g}_k)$. Hence $\mathbf{W}$ is an invariant subspace of $T$ and $T^*$. Hence $A = A_1 \oplus A_2$, i.e. $A$ is a block diagonal matrix. Thus $A_1 = (a_{ij})_{i,j=1}^k \in \mathbb{C}^{k \times k}$, $A_2 = (a_{ij})_{i,j=k+1}^n \in \mathbb{C}^{(n-k) \times (n-k)}$ represent the restrictions of $T$ to $\mathbf{W}$ and $\mathbf{U} := \mathbf{W}^{\perp}$, denoted by $T_1$ and $T_2$ respectively. Hence $\sigma_i(T_1) = \sigma_i(T)$ for $i = 1, \ldots, k$. Note that the restriction of $S$ to $\mathbf{W}$, denoted by $S_1$, is given by the diagonal matrix $D_1 := \mathrm{diag}(\epsilon_1, \ldots, \epsilon_k) \in \mathrm{U}(k)$. (4.10.10) yields that $S_1^{-1}T_1\mathbf{c}_i = \sigma_i(T)\mathbf{c}_i$ for $i = 1, \ldots, k$, i.e. $\sigma_1(T), \ldots, \sigma_k(T)$ are the eigenvalues of $S_1^{-1}T_1$. Clearly $S_1^{-1}T_1$ is represented in the basis $[\mathbf{g}_1, \ldots, \mathbf{g}_k]$ by the matrix $D_1^{-1}A_1$, which is an upper triangular matrix with $|\lambda_1(T)|, \ldots, |\lambda_k(T)|$ on the main diagonal. That is, $S_1^{-1}T_1$ has eigenvalues $|\lambda_1(T)|, \ldots, |\lambda_k(T)|$. Therefore $\sigma_i(T) = |\lambda_i(T)|$ for $i = 1, \ldots, k$. Theorem 4.9.6 yields that

$$\mathrm{tr}\, A_1^*A_1 = \sum_{i,j=1}^k |a_{ij}|^2 = \sum_{i=1}^k \sigma_i^2(A_1) = \sum_{i=1}^k \sigma_i^2(T_1) = \sum_{i=1}^k |\lambda_i(T)|^2.$$

As $\lambda_1(T), \ldots, \lambda_k(T)$ are the diagonal elements of $A_1$ it follows from the above equality that $A_1$ is a diagonal matrix. Hence we can choose $\mathbf{x}_i = \mathbf{g}_i$ for $i = 1, \ldots, n$ to obtain part 1 of the equality case.

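Let $T\mathbf{x} = \lambda\mathbf{x}$ where $\|\mathbf{x}\| = 1$ and $\rho(T) = |\lambda|$. Recall $\|T\|_2 = \sigma_1(T)$, where $\sigma_1(T)^2 = \lambda_1(T^*T)$ is the maximal eigenvalue of the self-adjoint operator $T^*T$. The maximum characterization of $\lambda_1(T^*T)$ yields that $|\lambda|^2 = \langle T\mathbf{x}, T\mathbf{x}\rangle = \langle T^*T\mathbf{x}, \mathbf{x}\rangle \le \lambda_1(T^*T) = \|T\|_2^2$. Hence $\rho(T) \le \|T\|_2$.

Assume now that $\rho(T) = \|T\|_2$. If $\rho(T) = 0$ then $\|T\|_2 = 0 \Rightarrow T = 0$, and the theorem holds trivially in this case. Assume that $\rho(T) > 0$. Hence the eigenvector $\mathbf{x}_1 := \mathbf{x}$ is also the eigenvector of $T^*T$ corresponding to $\lambda_1(T^*T) = |\lambda|^2$. Hence $|\lambda|^2\mathbf{x} = T^*T\mathbf{x} = T^*(\lambda\mathbf{x})$, which implies that $T^*\mathbf{x} = \bar{\lambda}\mathbf{x}$. Let $\mathbf{U} = \mathrm{span}\,(\mathbf{x})^{\perp}$ be the orthogonal complement of $\mathrm{span}\,(\mathbf{x})$. Since $T\,\mathrm{span}\,(\mathbf{x}) = \mathrm{span}\,(\mathbf{x})$ it follows that $T^*\mathbf{U} \subseteq \mathbf{U}$. Similarly, since $T^*\mathrm{span}\,(\mathbf{x}) = \mathrm{span}\,(\mathbf{x})$, $T\mathbf{U} \subseteq \mathbf{U}$. Thus $\mathbf{V} = \mathrm{span}\,(\mathbf{x}) \oplus \mathbf{U}$ and $\mathrm{span}\,(\mathbf{x}), \mathbf{U}$ are invariant subspaces of $T$ and $T^*$. Hence $\mathrm{span}\,(\mathbf{x}), \mathbf{U}$ are invariant subspaces of $T^*T$ and $TT^*$. Let $T_1$ be the restriction of $T$ to $\mathbf{U}$. Then $T_1^*T_1$ is the restriction of $T^*T$. Therefore $\|T_1\|_2^2 = \lambda_1(T_1^*T_1) \le \lambda_1(T^*T) = \|T\|_2^2$. This establishes the second part of the theorem, labeled (a) and (b).

The above results imply that the conditions (a) and (b) of the theorem yield the equality $\rho(T) = \|T\|_2$. □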
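
Inequality (4.10.14) can be illustrated on an upper triangular 2x2 matrix, whose eigenvalues are its diagonal entries. The sketch below is an example under stated assumptions (real entries, hypothetical matrices `A` and `N`), with singular values obtained from $\mathrm{tr}\, A^TA$ and $\det A$.

```python
import math

def singular_values(a):
    """Singular values (decreasing) of a real 2x2 matrix via tr(A^T A) and det(A)."""
    f = sum(x * x for row in a for x in row)          # sigma_1^2 + sigma_2^2
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]       # |det| = sigma_1 * sigma_2
    disc = math.sqrt(max(f * f - 4.0 * det * det, 0.0))
    return [math.sqrt((f + disc) / 2.0), math.sqrt(max((f - disc) / 2.0, 0.0))]

def partial_sums(a):
    """Pairs (sum of |eigenvalues|, sum of singular values) for k = 1, 2.

    Assumes a is real upper triangular, so the eigenvalues are a[0][0] and a[1][1].
    """
    eig = sorted((abs(a[0][0]), abs(a[1][1])), reverse=True)
    sv = singular_values(a)
    return [(eig[0], sv[0]), (eig[0] + eig[1], sv[0] + sv[1])]

A = [[1.0, 2.0], [0.0, 3.0]]     # hypothetical non-normal example
N = [[1.0, 0.0], [0.0, 3.0]]     # hypothetical normal (diagonal) example
print(partial_sums(A))           # strict inequalities expected
print(partial_sums(N))           # equalities expected
```

For the diagonal matrix the two sequences coincide, in agreement with the characterization of normal operators stated in Corollary 4.10.13.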



Corollary 4.10.13 Let $\mathbf{U}$ be an $n$-dimensional IPS over $\mathbb{C}$. Let $T : \mathbf{U} \to \mathbf{U}$ be a linear operator. Denote by $|\lambda(T)| = (|\lambda_1(T)|, \ldots, |\lambda_n(T)|)^T$ the absolute eigenvalues of $T$ (counting with their multiplicities), arranged in a decreasing order. Then $|\lambda(T)| = (\sigma_1(T), \ldots, \sigma_n(T))^T$ if and only if $T$ is a normal operator.

Problems 

1. Let the assumptions of Corollary 4.10.2 hold. 

(a) Since rank A < rank A show that the inequalities (4.10.4) reduce 
to o-i(A) = o-i(A) — for i > rank A. 

(b) Since H(A) is a submatrix of H(A) use the Cauchy interlacing 
principle to deduce the inequalities (4.10.4) for i = 1, . . . , rank A. 
Furthermore, if p' := m — f/=a, q' = n — then the Cauchy in- 
terlacing principle gives the complementary inequalities o~i(A) > 
o-i+ P '+q'(A) for any i E N. 

(c) Assume that (Ji(A) = ai(A) for i = 1, . . . , I < rank A. Compare 
the maximal characterization of the sum of the first k eigenvalues 
of H(A) and H(A) given by Theorem 4.4.8 for k = 1, . . . , / to 
deduce the last part of Corollary (4.10.2). 

2. Prove Corollary 4.10.3 by choosing any orthonormal basis in U, an 
orthonormal basis in V whose first dim W elements span W, and 
using Problem 1. 

3. Combine Theorems 4.7.15 and 4.10.1 to deduce Theorem 4.10.4. 

4. (a) Prove Corollary 4.10.5 

(b) Recall the definition of a norm on a vector space over $\mathbb{F} = \mathbb{R}, \mathbb{C}$
(see 7.4.1). Show that the function $f$ defined in Corollary 4.10.5 is a
norm. For $p = \min(m, n)$ and $w_1 = \ldots = w_p = 1$ this norm is
called the $q$-Schatten norm.

5. Prove Lemma 4.10.6. 

6. Under the assumptions of Theorem 4.10.7 show:

(a)

$$\max_{\{f_1,\ldots,f_k\}\in \mathrm{Fr}(k,U),\ \{g_1,\ldots,g_k\}\in \mathrm{Fr}(k,V)} \sum_{i=1}^{k} \Re\langle Tg_i, f_i\rangle_U = \max_{\{f_1,\ldots,f_k\}\in \mathrm{Fr}(k,U),\ \{g_1,\ldots,g_k\}\in \mathrm{Fr}(k,V)} \sum_{i=1}^{k} |\langle Tg_i, f_i\rangle_U|.$$

(b) For $w_1 \ge \ldots \ge w_k \ge 0$

$$\sum_{i=1}^{k} w_i\sigma_i(T) = \max_{\{f_1,\ldots,f_k\}\in \mathrm{Fr}(k,U),\ \{g_1,\ldots,g_k\}\in \mathrm{Fr}(k,V)} \sum_{i=1}^{k} w_i \Re\langle Tg_i, f_i\rangle_U.$$

7. Under the assumptions of Theorem 4.10.7, is it true that for $k > 1$

$$\sum_{i=1}^{k} \sigma_i(T) = \max \sum_{i=1}^{k} \|Tf_i\|_V\,?$$

I doubt it. 

8. Let $U, V$ be finite dimensional IPS. Assume that $P, T \in L(U,V)$.
Show that $\Re \operatorname{tr}(P^*T) \ge -\sum_{i=1}^{\min(m,n)} \sigma_i(P)\sigma_i(T)$. Equality holds if
and only if $S = -P$ and $T$ satisfy the conditions of Theorem 4.10.8.

4.11 Moore-Penrose generalized inverse 

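Let $A \in \mathbb{C}^{m\times n}$. Then (4.9.12) is called the reduced SVD of $A$. It can be
written as

(4.11.1) $A = U_r\Sigma_rV_r^*, \quad r = \operatorname{rank} A, \quad \Sigma_r := \operatorname{diag}(\sigma_1(A), \ldots, \sigma_r(A)) \in S_r(\mathbb{R}),$

$U_r = [u_1, \ldots, u_r] \in \mathbb{C}^{m\times r}, \quad V_r = [v_1, \ldots, v_r] \in \mathbb{C}^{n\times r}, \quad U_r^*U_r = V_r^*V_r = I_r.$

Recall that

$$AA^*u_i = \sigma_i(A)^2u_i, \qquad A^*Av_i = \sigma_i(A)^2v_i,$$

$$v_i = \frac{1}{\sigma_i(A)}A^*u_i, \qquad u_i = \frac{1}{\sigma_i(A)}Av_i, \qquad i = 1, \ldots, r.$$

Then

(4.11.2) $A^\dagger := V_r\Sigma_r^{-1}U_r^* \in \mathbb{C}^{n\times m}$

is the Moore-Penrose generalized inverse of $A$. If $A \in \mathbb{R}^{m\times n}$ then we assume
that $U_r \in \mathbb{R}^{m\times r}$ and $V_r \in \mathbb{R}^{n\times r}$, i.e. $U_r, V_r$ are real-valued matrices.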

Theorem 4.11.1 Let $A \in \mathbb{C}^{m\times n}$. Then the Moore-Penrose
generalized inverse $A^\dagger \in \mathbb{C}^{n\times m}$ satisfies the following properties.

1. $\operatorname{rank} A = \operatorname{rank} A^\dagger$.

2. $A^\dagger AA^\dagger = A^\dagger$, $AA^\dagger A = A$, $A^*AA^\dagger = A^\dagger AA^* = A^*$.

3. $A^\dagger A$ and $AA^\dagger$ are Hermitian nonnegative definite idempotent matrices,
i.e. $(A^\dagger A)^2 = A^\dagger A$ and $(AA^\dagger)^2 = AA^\dagger$, having the same rank as $A$.

4. The least squares problem for $Ax = b$, i.e. the system
$A^*Ax = A^*b$, has the solution $y = A^\dagger b$. This solution has the minimal
norm $\|y\|$ among all solutions of $A^*Ax = A^*b$.

5. If $\operatorname{rank} A = n$ then $A^\dagger = (A^*A)^{-1}A^*$. In particular, if $A \in \mathbb{C}^{n\times n}$ is
invertible then $A^\dagger = A^{-1}$.
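As an illustration of properties 2, 3 and 5, the following minimal sketch checks the identities in exact rational arithmetic. It assumes a small real matrix of full column rank (so that $A^* = A^\top$ and the formula of property 5 applies); the matrix and the helper names `matmul`, `transpose`, `inverse_2x2` are chosen here for illustration only.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def inverse_2x2(M):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# A real 3x2 matrix of rank 2 (full column rank); illustrative data only.
A = [[Fraction(1), Fraction(0)],
     [Fraction(1), Fraction(1)],
     [Fraction(0), Fraction(2)]]
At = transpose(A)

# Property 5: A^dagger = (A^T A)^{-1} A^T when rank A = n.
A_dag = matmul(inverse_2x2(matmul(At, A)), At)

P = matmul(A, A_dag)   # A A^dagger, a 3x3 matrix
Q = matmul(A_dag, A)   # A^dagger A, a 2x2 matrix

print(matmul(P, A) == A)                      # A A^dagger A = A
print(matmul(Q, A_dag) == A_dag)              # A^dagger A A^dagger = A^dagger
print(P == transpose(P), matmul(P, P) == P)   # A A^dagger is symmetric and idempotent
print(Q == [[1, 0], [0, 1]])                  # A^dagger A = I_n for full column rank
```

Exact fractions are used so that the equalities can be tested with `==`; with floating point one would compare up to a tolerance instead.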

To prove the above theorem we need the following proposition. 

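Proposition 4.11.2 Let $E \in \mathbb{C}^{l\times m}$, $G \in \mathbb{C}^{m\times n}$. Then
$\operatorname{rank} EG \le \min(\operatorname{rank} E, \operatorname{rank} G)$. If $l = m$ and $E$ is invertible then
$\operatorname{rank} EG = \operatorname{rank} G$. If $m = n$ and $G$ is invertible then $\operatorname{rank} EG = \operatorname{rank} E$.

Proof. Let $e_1, \ldots, e_m \in \mathbb{C}^l$, $g_1, \ldots, g_n \in \mathbb{C}^m$ be the columns of
$E$ and $G$ respectively. Then $\operatorname{rank} E = \dim \operatorname{span}(e_1, \ldots, e_m)$. Observe
that $EG = [Eg_1, \ldots, Eg_n] \in \mathbb{C}^{l\times n}$. Clearly $Eg_i$ is a linear combination
of the columns of $E$. Hence $Eg_i \in \operatorname{span}(e_1, \ldots, e_m)$. Therefore
$\operatorname{span}(Eg_1, \ldots, Eg_n) \subseteq \operatorname{span}(e_1, \ldots, e_m)$, which implies that $\operatorname{rank} EG \le \operatorname{rank} E$.
Note that $(EG)^\top = G^\top E^\top$. Hence
$\operatorname{rank} EG = \operatorname{rank} (EG)^\top \le \operatorname{rank} G^\top = \operatorname{rank} G$. Thus
$\operatorname{rank} EG \le \min(\operatorname{rank} E, \operatorname{rank} G)$. Suppose $E$ is invertible. Then
$\operatorname{rank} EG \le \operatorname{rank} G = \operatorname{rank} E^{-1}(EG) \le \operatorname{rank} EG$. Hence $\operatorname{rank} EG = \operatorname{rank} G$. Similarly
$\operatorname{rank} EG = \operatorname{rank} E$ if $G$ is invertible. □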
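The rank inequality can be checked on a small example. The sketch below is only an illustration: `rank` is a hypothetical helper that performs Gaussian elimination over the rationals, and the matrices `E`, `G`, `F` are arbitrary test data, not taken from the text.

```python
from fractions import Fraction

def rank(M):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        # find a row at or below position r with a nonzero entry in column c
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

E = [[1, 2, 3], [2, 4, 6]]      # 2x3, rank 1
G = [[1, 0], [0, 1], [1, 1]]    # 3x2, rank 2
F = [[2, 1], [1, 1]]            # 2x2, invertible (det = 1)

print(rank(E), rank(G), rank(matmul(E, G)))   # prints: 1 2 1
print(rank(matmul(F, E)) == rank(E))          # multiplying by invertible F keeps the rank
```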

Proof of Theorem 4.11.1. 

1. Proposition 4.11.2 yields that $\operatorname{rank} A^\dagger = \operatorname{rank} V_r\Sigma_r^{-1}U_r^* \le \operatorname{rank} \Sigma_r^{-1}U_r^* \le \operatorname{rank} \Sigma_r^{-1} = r = \operatorname{rank} A$.
Since $\Sigma_r^{-1} = V_r^*A^\dagger U_r$, Proposition 4.11.2
yields that $\operatorname{rank} A^\dagger \ge \operatorname{rank} \Sigma_r^{-1} = r$. Hence $\operatorname{rank} A = \operatorname{rank} A^\dagger$.

2. $AA^\dagger = (U_r\Sigma_rV_r^*)(V_r\Sigma_r^{-1}U_r^*) = U_r\Sigma_r\Sigma_r^{-1}U_r^* = U_rU_r^*$. Hence

$$AA^\dagger A = (U_rU_r^*)(U_r\Sigma_rV_r^*) = U_r\Sigma_rV_r^* = A.$$

Hence $A^*AA^\dagger = (V_r\Sigma_rU_r^*)(U_rU_r^*) = A^*$. Similarly $A^\dagger A = V_rV_r^*$ and
$A^\dagger AA^\dagger = A^\dagger$, $A^\dagger AA^* = A^*$.

3. Since $AA^\dagger = U_rU_r^*$ we deduce that $(AA^\dagger)^* = (U_rU_r^*)^* = (U_r^*)^*U_r^* = AA^\dagger$,
i.e. $AA^\dagger$ is Hermitian. Next $(AA^\dagger)^2 = (U_rU_r^*)^2 = (U_rU_r^*)(U_rU_r^*) = U_rU_r^* = AA^\dagger$,
i.e. $AA^\dagger$ is idempotent. Hence $AA^\dagger$ is nonnegative
definite. As $AA^\dagger = U_rI_rU_r^*$, the arguments of part 1 yield that
$\operatorname{rank} AA^\dagger = r$. Similar arguments apply to $A^\dagger A = V_rV_r^*$.

4. Since $A^*AA^\dagger = A^*$ it follows that $A^*A(A^\dagger b) = A^*b$, i.e. $y = A^\dagger b$ is
a least squares solution. It is left to show that if $A^*Ax = A^*b$ then
$\|x\| \ge \|A^\dagger b\|$ and equality holds if and only if $x = A^\dagger b$.

We now consider the system $A^*Ax = A^*b$. To analyze this system
we use the full form of SVD given in (4.9.7). It is equivalent to
$(V\Sigma^\top U^*)(U\Sigma V^*)x = V\Sigma^\top U^*b$. Multiplying by $V^*$ we obtain the
system $\Sigma^\top\Sigma(V^*x) = \Sigma^\top(U^*b)$. Let $z = (z_1, \ldots, z_n)^\top := V^*x$,
$c = (c_1, \ldots, c_m)^\top := U^*b$. Note that $z^*z = x^*VV^*x = x^*x$, i.e.
$\|z\| = \|x\|$. After these substitutions the least squares system in the
variables $z_1, \ldots, z_n$ is given in the form $\sigma_i(A)^2z_i = \sigma_i(A)c_i$ for $i = 1, \ldots, n$.
Since $\sigma_i(A) = 0$ for $i > r$ we obtain that $z_i = \frac{1}{\sigma_i(A)}c_i$
for $i = 1, \ldots, r$ while $z_{r+1}, \ldots, z_n$ are free variables. Thus
$\|z\|^2 = \sum_{i=1}^{r} \frac{|c_i|^2}{\sigma_i(A)^2} + \sum_{i=r+1}^{n} |z_i|^2$. Hence the least squares solution with the
minimal length $\|z\|$ is the solution with $z_i = 0$ for $i = r+1, \ldots, n$.
This solution corresponds to $x = A^\dagger b$.

5. Since $\operatorname{rank} A^*A = \operatorname{rank} A = n$ it follows that $A^*A$ is an invertible
matrix. Hence the least squares solution is unique and is given by
$x = (A^*A)^{-1}A^*b$. Thus for each $b$ one has $(A^*A)^{-1}A^*b = A^\dagger b$,
hence $A^\dagger = (A^*A)^{-1}A^*$.

If $A$ is an $n \times n$ matrix and is invertible it follows that $(A^*A)^{-1}A^* = A^{-1}(A^*)^{-1}A^* = A^{-1}$. □
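A small worked case may help to see part 4 at work. The matrix below is chosen here purely for illustration; it has rank 1 with two unknowns, so the least squares system has a one-parameter family of solutions.

```latex
A = \begin{bmatrix} 1 & 1 \end{bmatrix} \in \mathbb{R}^{1\times 2}, \qquad
\sigma_1(A) = \sqrt{2}, \quad u_1 = 1, \quad v_1 = \tfrac{1}{\sqrt{2}}(1,1)^\top,
\qquad
A^\dagger = v_1 \sigma_1(A)^{-1} u_1^* = \tfrac12 (1,1)^\top .

% The system A^*Ax = A^*b reduces to x_1 + x_2 = b, with solutions
x(s) = \left(\tfrac{b}{2} + s,\ \tfrac{b}{2} - s\right)^\top, \qquad
\|x(s)\|^2 = \tfrac{b^2}{2} + 2s^2 ,

% which is minimal exactly at s = 0, i.e. at
x(0) = \left(\tfrac{b}{2}, \tfrac{b}{2}\right)^\top = A^\dagger b .
```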

Problems 

1. $P \in \mathbb{C}^{n\times n}$ is called a projection if $P^2 = P$. Show that $P$ is a projection
if and only if the following two conditions are satisfied:

• Each eigenvalue of $P$ is either 0 or 1.

• $P$ is a diagonable matrix.

2. $P \in \mathbb{R}^{n\times n}$ is called an orthogonal projection if $P$ is a projection and
a symmetric matrix. Let $V \subseteq \mathbb{R}^n$ be the subspace spanned by the
columns of $P$. Show that for any $a \in \mathbb{R}^n$, $b \in V$, $\|a - b\| \ge \|a - Pa\|$
and equality holds if and only if $b = Pa$. That is, $Pa$ is the orthogonal
projection of $a$ on the column space of $P$.

3. Let $A \in \mathbb{R}^{m\times n}$ and assume that the SVD of $A$ is given by (4.9.7),
where $U \in O(m,\mathbb{R})$, $V \in O(n,\mathbb{R})$.

(a) What is the SVD of $A^\top$?

(b) Show that $(A^\top)^\dagger = (A^\dagger)^\top$.

(c) Suppose that $B \in \mathbb{R}^{l\times m}$. Is it true that $(BA)^\dagger = A^\dagger B^\dagger$? Justify!

4.12 Approximation by low rank matrices 

We now restate Theorem 4.10.8 in matrix terms. That is, we view $A, B \in \mathbb{C}^{m\times n}$
as linear operators $A, B : \mathbb{C}^n \to \mathbb{C}^m$, where $\mathbb{C}^m$, $\mathbb{C}^n$ are IPS equipped
with the standard inner product.

Theorem 4.12.1 Let $A, B \in \mathbb{C}^{m\times n}$, and assume that $\sigma_1(A) \ge \sigma_2(A) \ge \ldots \ge 0$,
$\sigma_1(B) \ge \sigma_2(B) \ge \ldots \ge 0$, where $\sigma_i(A) = 0$ and $\sigma_j(B) = 0$ for
$i > \operatorname{rank} A$ and $j > \operatorname{rank} B$ respectively. Then

(4.12.1) $-\sum_{i=1}^{m} \sigma_i(A)\sigma_i(B) \le \Re \operatorname{tr} AB^* \le \sum_{i=1}^{m} \sigma_i(A)\sigma_i(B).$

Equality in the right-hand side holds if and only if $\mathbb{C}^n$, $\mathbb{C}^m$ have two
orthonormal bases $[c_1, \ldots, c_n]$, $[d_1, \ldots, d_m]$ such that (4.10.10) is satisfied
for $T = A$ and $S = B$. Equality for the left-hand side holds if and only
if $\mathbb{C}^n$, $\mathbb{C}^m$ have two orthonormal bases $[c_1, \ldots, c_n]$, $[d_1, \ldots, d_m]$ such that
(4.10.10) is satisfied for $T = A$ and $S = -B$.

Theorem 4.10.9 yields:

Corollary 4.12.2 For $A \in \mathbb{C}^{m\times n}$ let $A_k$ be defined as in Definition
4.9.5. Then $\min_{B \in \mathcal{R}(m,n,k)} \|A - B\|_F^2 = \|A - A_k\|_F^2 = \sum_{i=k+1}^{\operatorname{rank} A} \sigma_i(A)^2$. $A_k$
is the unique solution to this minimal problem if and only if $1 \le k < \operatorname{rank} A$
and $\sigma_k(A) > \sigma_{k+1}(A)$.
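For a concrete instance, consider a diagonal matrix with distinct positive diagonal entries. The example is an illustration only; it assumes that $A_k$ of Definition 4.9.5 is the truncation of the SVD to its $k$ largest singular values.

```latex
A = \operatorname{diag}(3, 2, 1), \qquad
\sigma_1(A) = 3,\ \sigma_2(A) = 2,\ \sigma_3(A) = 1 .

% k = 1:
A_1 = \operatorname{diag}(3, 0, 0), \qquad
\|A - A_1\|_F^2 = \sigma_2(A)^2 + \sigma_3(A)^2 = 4 + 1 = 5 .

% k = 2:
A_2 = \operatorname{diag}(3, 2, 0), \qquad
\|A - A_2\|_F^2 = \sigma_3(A)^2 = 1 .

% In both cases \sigma_k(A) > \sigma_{k+1}(A), so the minimizer is unique.
```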

We now give a generalization of Corollary 4.10.9. Let $A \in \mathbb{C}^{m\times n}$ and
assume that $A = U_A\Sigma_AV_A^*$ is the SVD of $A$ given in (4.9.7). Let $U_A = [u_1\ u_2 \ldots u_m]$,
$V_A = [v_1\ v_2 \ldots v_n]$ be the representations of $U$, $V$ in terms
of their $m$, $n$ columns respectively. Then

(4.12.2) $P_{A,\mathrm{left}} := \sum_{i=1}^{\operatorname{rank} A} u_iu_i^* \in \mathbb{C}^{m\times m}, \qquad P_{A,\mathrm{right}} := \sum_{i=1}^{\operatorname{rank} A} v_iv_i^* \in \mathbb{C}^{n\times n}$

are the orthogonal projections on the range of $A$ and $A^*$, respectively.

Theorem 4.12.3 Let $A \in \mathbb{C}^{m\times n}$, $C \in \mathbb{C}^{m\times p}$, $R \in \mathbb{C}^{q\times n}$ be given. Then
$X = C^\dagger(P_{C,\mathrm{left}}AP_{R,\mathrm{right}})_kR^\dagger$ is a solution to the minimal problem

(4.12.3) $\min_{X \in \mathcal{R}(p,q,k)} \|A - CXR\|_F,$

having the minimal $\|X\|_F$. This solution is unique if and only if either
$k \ge \operatorname{rank} P_{C,\mathrm{left}}AP_{R,\mathrm{right}}$ or $1 \le k < \operatorname{rank} P_{C,\mathrm{left}}AP_{R,\mathrm{right}}$ and
$\sigma_k(P_{C,\mathrm{left}}AP_{R,\mathrm{right}}) > \sigma_{k+1}(P_{C,\mathrm{left}}AP_{R,\mathrm{right}})$.

Proof. Assume that $C = U_C\Sigma_CV_C^*$, $R = U_R\Sigma_RV_R^*$ are the SVD
decompositions of $C$, $R$, respectively. Recall that the Frobenius norm is
invariant under multiplication from the left and the right by the
corresponding unitary matrices. Hence $\|A - CXR\|_F = \|\tilde{A} - \Sigma_C\tilde{X}\Sigma_R\|_F$, where
$\tilde{A} := U_C^*AV_R$, $\tilde{X} := V_C^*XU_R$. Clearly $X$ and $\tilde{X}$ have the same rank and
the same Frobenius norm. Thus it is enough to consider the minimal problem
$\min_{\tilde{X} \in \mathcal{R}(p,q,k)} \|\tilde{A} - \Sigma_C\tilde{X}\Sigma_R\|_F$. Let $s = \operatorname{rank} C$, $t = \operatorname{rank} R$. Clearly
if $C$ or $R$ is a zero matrix, then $X = 0_{p\times q}$ is the solution to the minimal
problem (4.12.3). In this case either $P_{C,\mathrm{left}}$ or $P_{R,\mathrm{right}}$ is a zero matrix,
and the theorem holds trivially.

It is left to consider the case $1 \le s$, $1 \le t$. Define $C_1 := \operatorname{diag}(\sigma_1(C), \ldots, \sigma_s(C)) \in \mathbb{C}^{s\times s}$,
$R_1 := \operatorname{diag}(\sigma_1(R), \ldots, \sigma_t(R)) \in \mathbb{C}^{t\times t}$. Partition $\tilde{A}$ and $\tilde{X}$ into $2 \times 2$
block matrices $\tilde{A} = [A_{ij}]_{i,j=1}^{2}$ and $\tilde{X} = [X_{ij}]_{i,j=1}^{2}$, where $A_{11}, X_{11} \in \mathbb{C}^{s\times t}$.
(For certain values of $s$ and $t$, we may have to partition $\tilde{A}$ or $\tilde{X}$ into fewer than
$2 \times 2$ blocks.) Observe next that $Z := \Sigma_C\tilde{X}\Sigma_R = [Z_{ij}]_{i,j=1}^{2}$, where
$Z_{11} = C_1X_{11}R_1$ and all other blocks $Z_{ij}$ are zero matrices. Hence

$$\|\tilde{A} - Z\|_F^2 = \|A_{11} - Z_{11}\|_F^2 + \sum_{2 < i+j \le 4} \|A_{ij}\|_F^2 \ge \|A_{11} - (A_{11})_k\|_F^2 + \sum_{2 < i+j \le 4} \|A_{ij}\|_F^2.$$

Thus $\tilde{X} = [X_{ij}]_{i,j=1}^{2}$, where $X_{11} = C_1^{-1}(A_{11})_kR_1^{-1}$ and $X_{ij} = 0$ for all
$(i,j) \ne (1,1)$, is a solution of $\min_{\tilde{X} \in \mathcal{R}(p,q,k)} \|\tilde{A} - \Sigma_C\tilde{X}\Sigma_R\|_F$ with the minimal
Frobenius norm. This solution is unique if and only if the solution $Z_{11} = (A_{11})_k$
is the unique solution to $\min_{Z_{11} \in \mathcal{R}(s,t,k)} \|A_{11} - Z_{11}\|_F$. This happens
if either $k \ge \operatorname{rank} A_{11}$ or $1 \le k < \operatorname{rank} A_{11}$ and $\sigma_k(A_{11}) > \sigma_{k+1}(A_{11})$. A
straightforward calculation shows that $\tilde{X} = \Sigma_C^\dagger(P_{\Sigma_C,\mathrm{left}}\tilde{A}P_{\Sigma_R,\mathrm{right}})_k\Sigma_R^\dagger$.
This shows that $X = C^\dagger(P_{C,\mathrm{left}}AP_{R,\mathrm{right}})_kR^\dagger$ is a solution of (4.12.3)
with the minimal Frobenius norm. This solution is unique if and only
if either $k \ge \operatorname{rank} P_{C,\mathrm{left}}AP_{R,\mathrm{right}}$ or $1 \le k < \operatorname{rank} P_{C,\mathrm{left}}AP_{R,\mathrm{right}}$ and
$\sigma_k(P_{C,\mathrm{left}}AP_{R,\mathrm{right}}) > \sigma_{k+1}(P_{C,\mathrm{left}}AP_{R,\mathrm{right}})$. □

Corollary 4.12.4 Let the assumptions of Theorem 4.12.3 hold. Then
$X = C^\dagger AR^\dagger$ is the unique solution to the minimal problem $\min_{X \in \mathbb{C}^{p\times q}} \|A - CXR\|_F$
with the minimal Frobenius norm.

Theorem 4.12.5 Let $a_1, \ldots, a_n \in \mathbb{C}^m$ and $k \in [1, m-1] \cap \mathbb{N}$ be given.
Let $A = [a_1 \ldots a_n] \in \mathbb{C}^{m\times n}$. Denote by $L_k \in \mathrm{Gr}(k, \mathbb{C}^m)$ a $k$-dimensional
subspace spanned by the first $k$ left singular vectors of $A$. Then

(4.12.4) $\min_{L \in \mathrm{Gr}(k,\mathbb{C}^m)} \sum_{i=1}^{n} \min_{b_i \in L} \|a_i - b_i\|_2^2 = \sum_{i=1}^{n} \min_{b_i \in L_k} \|a_i - b_i\|_2^2.$

Proof. Let $L \in \mathrm{Gr}(k, \mathbb{C}^m)$ and $b_1, \ldots, b_n \in L$. Then $B := [b_1 \ldots b_n] \in \mathcal{R}(m, n, k)$.
Vice versa, given $B \in \mathcal{R}(m, n, k)$, the column space of $B$
is contained in some $L \in \mathrm{Gr}(k, \mathbb{C}^m)$. Hence $\sum_{i=1}^{n} \|a_i - b_i\|_2^2 = \|A - B\|_F^2$.
Corollary 4.12.2 implies that the minimum stated in the left-hand side of
(4.12.4) is achieved by the $n$ columns of $A_k$. Clearly the column space of
$A_k$ is equal to $L_k$. (Note that $L_k$ is not unique. See Problem 3.) □

Problems

1. Let $A \in S_n(\mathbb{R})$ and assume that $A = Q^\top\Lambda Q$, where $Q \in O(n,\mathbb{R})$ and
$\Lambda = \operatorname{diag}(\alpha_1, \ldots, \alpha_n)$ is a diagonal matrix, where $|\alpha_1| \ge \ldots \ge |\alpha_n| \ge 0$.

(a) Find the SVD of $A$.

(b) Show that $\sigma_1(A) = \max(\lambda_1(A), |\lambda_n(A)|)$, where $\lambda_1(A) \ge \ldots \ge \lambda_n(A)$
are the $n$ eigenvalues of $A$ arranged in decreasing order.

2. Let $k, m, n$ be positive integers such that $k \le \min(m, n)$. Show
that the function $f : \mathbb{R}^{m\times n} \to [0,\infty)$ given by $f(A) = \sum_{i=1}^{k} \sigma_i(A)$ is a
convex function on $\mathbb{R}^{m\times n}$.

3. Show that the minimal subspace for the problem (4.12.4) is unique if
and only if $\sigma_k(A) > \sigma_{k+1}(A)$.

4. Prove Corollary 4.12.4.

4.13 CUR-approximations

Let $A = (a_{ij})_{i,j=1}^{m,n} \in \mathbb{C}^{m\times n}$, where $m, n$ are big, e.g. $m, n \ge 10^6$. Then
the low rank approximation of $A$ given by its SVD has prohibitively high
computational complexity and storage requirements. In this section we
discuss a low rank approximation of $A$ of the form $CUR$, where
$C \in \mathbb{C}^{m\times p}$, $R \in \mathbb{C}^{q\times n}$ are obtained from $A$ by reading $p$, $q$ columns and rows
of $A$, respectively. If one chooses $U$ as the best least squares approximation
given by Corollary 4.12.4 then $U = C^\dagger AR^\dagger$. Again, for very large $m, n$ this
$U$ has too high computational complexity. In this section we give different
ways to compute $U$ of a relatively low computational complexity.
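Let

$$I = \{1 \le \alpha_1 < \ldots < \alpha_q \le m\} \subset \langle m\rangle, \qquad J = \{1 \le \beta_1 < \ldots < \beta_p \le n\} \subset \langle n\rangle$$

be two nonempty sets of cardinality $q$, $p$ respectively. Using the indices in
$I$, $J$, we consider the submatrices

(4.13.1) $A_{IJ} = (a_{\alpha_k\beta_l})_{k,l=1}^{q,p} \in \mathbb{C}^{q\times p}, \quad R = A_{I\langle n\rangle} = (a_{\alpha_kj})_{k,j=1}^{q,n} \in \mathbb{C}^{q\times n}, \quad C = A_{\langle m\rangle J} = (a_{i\beta_l})_{i,l=1}^{m,p} \in \mathbb{C}^{m\times p}.$

Thus, $C = A_{\langle m\rangle J}$ and $R = A_{I\langle n\rangle}$ are composed of the columns in $J$ and
the rows in $I$ of $A$, respectively. The read entries of $A$ are in the index set

(4.13.2) $S := \langle m\rangle \times \langle n\rangle \setminus \big((\langle m\rangle \setminus I) \times (\langle n\rangle \setminus J)\big), \qquad \#S = mp + qn - pq.$

We look for a matrix $F = CUR \in \mathbb{C}^{m\times n}$, with $U \in \mathbb{C}^{p\times q}$ still to be
determined. We determine $U_{\mathrm{opt}}$ as a solution to the least squares problem
of minimizing $\sum_{(i,j)\in S} |a_{ij} - (CUR)_{ij}|^2$, i.e.,

(4.13.3) $U_{\mathrm{opt}} = \arg\min_{U \in \mathbb{C}^{p\times q}} \sum_{(i,j)\in S} |a_{ij} - (CUR)_{ij}|^2.$

It is straightforward to see that the above least squares solution is the least squares
solution of the following overdetermined system

(4.13.4) $T\hat{U} = \hat{A}, \quad T = (t_{(i,j),(k,l)}) \in \mathbb{C}^{(mp+qn-pq)\times pq}, \quad t_{(i,j),(k,l)} = a_{i\beta_k}a_{\alpha_lj},$

$\hat{U} = (u_{(k,l)}) \in \mathbb{C}^{pq}, \quad \hat{A} = (a_{(i,j)}) \in \mathbb{C}^{mp+qn-pq}, \quad (i,j) \in S, \ (k,l) \in \langle p\rangle \times \langle q\rangle.$

Here $\hat{U}$, $\hat{A}$ are viewed as vectors whose coordinates are the entries of $U$ and
the entries of $A$ which are either in $C$ or $R$. Note that $T$ is a corresponding
submatrix of $A \otimes A$.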
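The count $\#S = mp + qn - pq$ in (4.13.2) can be checked by direct enumeration. The following is a minimal sketch with illustrative sizes and index sets; the function name `read_index_set` is hypothetical.

```python
def read_index_set(m, n, I, J):
    """Positions (i, j), 1-based, that lie in a row of I or in a column of J."""
    return {(i, j)
            for i in range(1, m + 1)
            for j in range(1, n + 1)
            if i in I or j in J}

m, n = 5, 4
I = {1, 3}        # q = 2 rows are read
J = {2, 3, 4}     # p = 3 columns are read
S = read_index_set(m, n, I, J)
q, p = len(I), len(J)

# Both numbers agree: the pq entries of A_IJ are counted once, not twice.
print(len(S), m * p + q * n - p * q)   # prints: 17 17
```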

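Theorem 4.13.1 Let $A \in \mathbb{C}^{m\times n}$, and let $I \subset \langle m\rangle$, $J \subset \langle n\rangle$ have
cardinality $q$ and $p$, respectively. Let $C = A_{\langle m\rangle J} \in \mathbb{C}^{m\times p}$ and $R = A_{I\langle n\rangle} \in \mathbb{C}^{q\times n}$
be as in (4.13.1) and suppose that $A_{IJ}$ is invertible. Then the overdetermined
system (4.13.4) has a unique solution $U = A_{IJ}^{-1}$, i.e., the rows in
$I$ and the columns in $J$ of the matrix $CA_{IJ}^{-1}R$ are equal to the corresponding
rows and columns of $A$, respectively.

Proof. For any $I \subset \langle m\rangle$, $J \subset \langle n\rangle$, with $\#I = q$, $\#J = p$, and $U \in \mathbb{C}^{p\times q}$
we have the identity

(4.13.5) $(A_{\langle m\rangle J}UA_{I\langle n\rangle})_{IJ} = A_{IJ}UA_{IJ}.$

Hence the part of the system (4.13.4) corresponding to $(CUR)_{IJ} = A_{IJ}$
reduces to the equation

(4.13.6) $A_{IJ}UA_{IJ} = A_{IJ}.$

If $A_{IJ}$ is a square matrix and invertible, then the unique solution to this
matrix equation is $U = A_{IJ}^{-1}$. Furthermore

$$(A_{\langle m\rangle J}A_{IJ}^{-1}A_{I\langle n\rangle})_{I\langle n\rangle} = A_{IJ}A_{IJ}^{-1}A_{I\langle n\rangle} = A_{I\langle n\rangle},$$

$$(A_{\langle m\rangle J}A_{IJ}^{-1}A_{I\langle n\rangle})_{\langle m\rangle J} = A_{\langle m\rangle J}A_{IJ}^{-1}A_{IJ} = A_{\langle m\rangle J}.$$

□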
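The statement of Theorem 4.13.1 can be tried out in exact arithmetic. The sketch below assumes $p = q = 2$ and uses 0-based index lists; the helper names (`cur`, `inverse_2x2`, `frac`) and the test matrices are illustrative choices. For a matrix of rank 2 the product $CA_{IJ}^{-1}R$ reproduces $A$ exactly (this is Problem 1 of this section), while for a matrix of higher rank only the rows in $I$ and the columns in $J$ are reproduced.

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def inverse_2x2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def cur(A, I, J):
    """Return C * A_IJ^{-1} * R for 0-based index lists I (rows), J (columns), #I = #J = 2."""
    C = [[row[j] for j in J] for row in A]        # columns of A indexed by J
    R = [A[i] for i in I]                         # rows of A indexed by I
    A_IJ = [[A[i][j] for j in J] for i in I]      # 2x2 intersection block
    return matmul(matmul(C, inverse_2x2(A_IJ)), R)

def frac(M):
    return [[Fraction(x) for x in row] for row in M]

I, J = [0, 2], [1, 3]

# Rank 2: row 1 = row 0 + row 2, row 3 = 2*row 0 - row 2.
A2 = frac([[1, 2, 0, 1], [1, 3, 1, 3], [0, 1, 1, 2], [2, 3, -1, 0]])
print(cur(A2, I, J) == A2)                        # exact recovery

# Rank 3: same matrix with the (3, 3) entry changed.
A3 = frac([[1, 2, 0, 1], [1, 3, 1, 3], [0, 1, 1, 2], [2, 3, -1, 5]])
F = cur(A3, I, J)
rows_match = all(F[i] == A3[i] for i in I)
cols_match = all(F[i][j] == A3[i][j] for i in range(4) for j in J)
print(rows_match, cols_match, F == A3)            # read rows and columns agree, F != A3
```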

This result extends to the general nonsquare case.

Theorem 4.13.2 Let $A \in \mathbb{C}^{m\times n}$, and let $I \subset \langle m\rangle$, $J \subset \langle n\rangle$ have
cardinality $q$ and $p$, respectively. Let $C = A_{\langle m\rangle J} \in \mathbb{C}^{m\times p}$, and $R = A_{I\langle n\rangle} \in \mathbb{C}^{q\times n}$
be as in (4.13.1). Then $U = A_{IJ}^\dagger$ is the minimal solution (in Frobenius
norm) of (4.13.3).

Proof. Consider the SVD decomposition of $A_{IJ}$

$$A_{IJ} = W\Sigma V^*, \quad W \in \mathbb{C}^{q\times q}, \ V \in \mathbb{C}^{p\times p}, \quad \Sigma = \operatorname{diag}(\sigma_1, \ldots, \sigma_r, 0, \ldots, 0) \in \mathbb{R}_+^{q\times p},$$

where $W$, $V$ are unitary matrices and $\sigma_1, \ldots, \sigma_r$ are the positive singular
values of $A_{IJ}$. In view of Theorem 4.13.1 it is enough to assume that
$\max(p, q) > r$. W.l.o.g. we may assume that $I = \langle q\rangle$, $J = \langle p\rangle$. Let

$$W_1 = \begin{pmatrix} W & 0_{q\times(m-q)} \\ 0_{(m-q)\times q} & I_{m-q} \end{pmatrix} \in \mathbb{C}^{m\times m}, \qquad
V_1 = \begin{pmatrix} V & 0_{p\times(n-p)} \\ 0_{(n-p)\times p} & I_{n-p} \end{pmatrix} \in \mathbb{C}^{n\times n}.$$

Replace $A$ by $A_1 = W_1^*AV_1$. It is easy to see that it is enough to prove
the theorem for $A_1$. For simplicity of notation we assume that $A_1 = A$.
That is, we assume that $A_{IJ} = \Sigma_r \oplus 0_{(q-r)\times(p-r)}$, where $\Sigma_r = \operatorname{diag}(\sigma_1, \ldots, \sigma_r)$
and $r = \operatorname{rank} A_{IJ}$. For $U \in \mathbb{C}^{p\times q}$ denote by $U_r \in \mathbb{C}^{p\times q}$ the matrix obtained
from $U$ by replacing the last $p - r$ rows and $q - r$ columns by rows
and columns of zeroes, respectively. Note that then $CUR = CU_rR$ and
$\|U_r\|_F \le \|U\|_F$, and equality holds if and only if $U = U_r$. Hence, the
minimal Frobenius norm least squares solution $U$ of (4.13.3) is given by $U = U_r$.
Using the fact that the rows $r+1, \ldots, q$ and columns $r+1, \ldots, p$ of $CUR$
are zero it follows that the minimum in (4.13.3) is reduced to the minimum
on $S' = \langle m\rangle \times \langle r\rangle \cup \langle r\rangle \times \langle n\rangle$. Then, by Theorem 4.13.1 the solution to the
minimal Frobenius norm least squares problem is given by $\Sigma^\dagger$. □



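For a matrix $A$ define the entrywise maximal norm

(4.13.7) $\|A\|_{\infty,e} := \max_{i \in \langle m\rangle,\, j \in \langle n\rangle} |a_{ij}|, \qquad A = (a_{ij}) \in \mathbb{C}^{m\times n}.$

Theorem 4.13.3 Let $A \in \mathbb{C}^{m\times n}$, $p \in [1, \operatorname{rank} A] \cap \mathbb{N}$. Define

(4.13.8) $\mu_p := \max_{I \subset \langle m\rangle,\, J \subset \langle n\rangle,\, \#I = \#J = p} |\det A_{IJ}| > 0.$

Suppose that

(4.13.9) $|\det A_{IJ}| \ge \delta\mu_p, \quad \delta \in (0,1], \quad I \subset \langle m\rangle, \ J \subset \langle n\rangle, \ \#I = \#J = p.$

Then for $C$, $R$ defined by (4.13.1) we have

(4.13.10) $\|A - CA_{IJ}^{-1}R\|_{\infty,e} \le \frac{p+1}{\delta}\sigma_{p+1}(A).$

Proof. We now estimate $|a_{ij} - (CA_{IJ}^{-1}R)_{ij}|$ from above. In the case
$p = \operatorname{rank} A$, i.e. $\sigma_{p+1}(A) = 0$, we deduce from Problem 1 that
$a_{ij} - (CA_{IJ}^{-1}R)_{ij} = 0$. Assume $\sigma_{p+1}(A) > 0$. By Theorem 4.13.1 $a_{ij} - (CA_{IJ}^{-1}R)_{ij} = 0$
if either $i \in I$ or $j \in J$. It is left to consider the case $i \in \langle m\rangle \setminus I$, $j \in \langle n\rangle \setminus J$.
Let $K = I \cup \{i\}$, $L = J \cup \{j\}$. Let $B = A_{KL}$. If $\operatorname{rank} B = p$ then Problem
1 yields that $B = B_{KJ}A_{IJ}^{-1}B_{IL}$. Hence $a_{ij} - (CA_{IJ}^{-1}R)_{ij} = 0$. Assume that
$\det B \ne 0$. We claim that

(4.13.11) $a_{ij} - (CA_{IJ}^{-1}R)_{ij} = \pm\frac{\det B}{\det A_{IJ}}.$

It is enough to consider the case where $I = J = \langle p\rangle$, $i = j = p + 1$, $K = L = \langle p + 1\rangle$.
In view of Theorem 4.13.1 $B - B_{KJ}A_{IJ}^{-1}B_{IL} = \operatorname{diag}(0, \ldots, 0, t)$,
where $t$ is equal to the left-hand side of (4.13.11). Multiply this matrix
equality from the left by $B^{-1} = (b_{st,-1})_{s,t=1}^{p+1}$. Note that the last
row of $B^{-1}B_{KJ}$ is zero. Hence we deduce that $b_{(p+1)(p+1),-1}\,t = 1$, i.e.
$t = b_{(p+1)(p+1),-1}^{-1}$. Use the identity $B^{-1} = (\det B)^{-1}\operatorname{adj} B$ to deduce the
equality (4.13.11).

We now estimate $\sigma_1(B^{-1})$ from above. Note that each entry of $B^{-1} = (\det B)^{-1}\operatorname{adj} B$
is bounded above by $|\det B|^{-1}\mu_p$. Hence $\sigma_1(B^{-1}) \le \frac{(p+1)\mu_p}{|\det B|}$.
Recall that $\sigma_1(B^{-1}) = \sigma_{p+1}(B)^{-1}$. Thus

$$\frac{|\det B|}{|\det A_{IJ}|} \le \frac{(p+1)\mu_p\sigma_{p+1}(B)}{|\det A_{IJ}|} \le \frac{(p+1)\sigma_{p+1}(B)}{\delta}.$$

Since $B$ is a submatrix of $A$ we deduce $\sigma_{p+1}(B) \le \sigma_{p+1}(A)$. □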



Problems 

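1. Let $A \in \mathbb{C}^{m\times n}$, $\operatorname{rank} A = r$. Assume that $I \subset \langle m\rangle$, $J \subset \langle n\rangle$, $\#I = \#J = r$.
Assume that $\det A_{IJ} \ne 0$. Show that $A = CA_{IJ}^{-1}R$.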



4.14 Some special maximal spectral problems 

Let $S \subset V$. Then the convex hull of $S$, denoted by $\operatorname{conv} S$, is the minimal
convex set containing $S$. Thus $\Pi_n := \operatorname{conv}\{e_1, \ldots, e_n\} \subset \mathbb{R}^n$, where $e_1, \ldots, e_n$
is the standard basis in $\mathbb{R}^n$, is the set of probability vectors in $\mathbb{R}^n$.

Assume that $C \subset V$ is convex. A point $e \in C$ is called an extremal
point if for any $x, y \in C$ such that $e \in [x, y]$ the equality $x = y = e$ holds.
For a convex set $C$ denote by $\operatorname{ext} C$ the set of the extremal points of $C$. It
is known that $\operatorname{ext} \operatorname{conv} S \subset S$ [Roc70] or see Problem 1.

Definition 4.14.1 Let $S \subset V$. For each $j \in \mathbb{N}$ let

$$\operatorname{conv}_{j-1} S = \Big\{z \in V : z = \sum_{i=1}^{j} p_ix_i, \text{ for all } p = (p_1, \ldots, p_j)^\top \in \Pi_j, \ x_1, \ldots, x_j \in S\Big\}.$$

Let $C$ be a convex set in $V$. Assume that $\operatorname{ext} C \ne \emptyset$. Then $F_{j-1}(C) := \operatorname{conv}_{j-1}(\operatorname{ext} C)$
is called the $(j-1)$-dimensional face of $C$ for any $j \in \mathbb{N}$.

Suppose that $S$ is a finite set of cardinality $N \in \mathbb{N}$. Then $\operatorname{conv} S = \operatorname{conv}_{N-1} S$,
see Problem 1, and $\operatorname{conv} S$ is called a finitely generated convex
set. Note that $F_0(S) = S$. The following result is well known [Roc70]. (See
Problem 1 for finitely generated convex sets.)

Theorem 4.14.2 Let $V$ be a vector space over the reals of finite dimension.
Let $C \subset V$ be a nonempty compact convex set. Then $\operatorname{conv} \operatorname{ext} C = C$. Let
$d := \dim C$. Then $F_d(\operatorname{ext} C) = C$. More generally, for any $S \subset V$ let
$d = \dim \operatorname{conv} S$. Then $F_d(S) = \operatorname{conv} S$.

Throughout this book we assume that $V$ is finite dimensional, unless stated
otherwise. In many cases we shall identify $V$ with $\mathbb{R}^d$. Assume that
$C \subset V$ is a nonempty compact convex set of dimension $d$. Then the
following facts are known. If $d = 2$ then $\operatorname{ext} C$ is a closed set. For $d \ge 3$
there exist $C$ such that $\operatorname{ext} C$ is not closed.

The following result is well known (see Problem 3):

Proposition 4.14.3 Let $S \subset V$ and assume that $f : \operatorname{conv} S \to \mathbb{R}$ is a
convex function. Then

$$\sup_{x \in \operatorname{conv} S} f(x) = \sup_{y \in S} f(y).$$

If in addition $S$ is compact and $f$ is continuous on $\operatorname{conv} S$ then one can
replace sup by max.

See Problem 4 for a generalization of this proposition.

Corollary 4.14.4 Let $V$ be a finite dimensional IPS over $\mathbb{F} = \mathbb{R}, \mathbb{C}$.
Let $S \subset S(V)$. Let $f : \operatorname{conv} S \to \mathbb{R}$ be a convex function. Then

$$\sup_{A \in \operatorname{conv} S} f(A) = \sup_{B \in S} f(B).$$

The aim of this section is to give a generalization of this result to certain
spectral nonconvex functions $f$.

Definition 4.14.5 For $x = (x_1, \ldots, x_n)^\top$, $y = (y_1, \ldots, y_n)^\top \in \mathbb{R}^n$ let
$x \le y \iff x_i \le y_i$, $i = 1, \ldots, n$. Let $D \subset \mathbb{R}^n$ and $f : D \to \mathbb{R}$. $f$ is called
a nondecreasing function on $D$ if for any $x, y \in D$ one has the implication
$x \le y \Rightarrow f(x) \le f(y)$.

Theorem 4.14.6 Let $V$ be an $n$-dimensional IPS over $\mathbb{R}$. Let $p \in [1,n] \cap \mathbb{N}$
and $D \subset \mathbb{R}^n_{\searrow}$ be a convex Schur set. Let $D_p$ be the projection
of $D$ on the first $p$ coordinates. Let $h : D_p \to \mathbb{R}$ and assume that
$f : \lambda^{-1}(D) \to \mathbb{R}$ is the spectral function given by $A \mapsto h(\lambda_{(p)}(A))$, where
$\lambda_{(p)}(A) = (\lambda_1(A), \ldots, \lambda_p(A))^\top$. Let $S \subset \lambda^{-1}(D)$. Assume that $h$ is nondecreasing
on $D_p$. Then

(4.14.1) $\sup_{A \in \operatorname{conv} S} f(A) = \sup_{B \in \operatorname{conv}_{\binom{p+1}{2}-1} S} f(B),$

and this result is sharp.

Proof. Since $\dim \operatorname{conv} S \le \dim S(V) = \binom{n+1}{2}$, Theorem 4.14.2 implies
that it is enough to prove the theorem in the case $S = T := \{A_1, \ldots, A_N\}$,
where $N \le \binom{n+1}{2} + 1$. Observe next that since $D$ is a convex Schur set and
$S \subset \lambda^{-1}(D)$ it follows that $\operatorname{conv} S \subset \lambda^{-1}(D)$ (Problem 6).

Let $A \in \operatorname{conv} T$. Assume that $x_1, \ldots, x_p$ are $p$ orthonormal eigenvectors
of $A$ corresponding to the eigenvalues $\lambda_1(A), \ldots, \lambda_p(A)$. For any $B \in S(V)$
let $B(x_1, \ldots, x_p) := (\langle Bx_i, x_j\rangle)_{i,j=1}^{p} \in S_p(\mathbb{R})$. We view $S_p(\mathbb{R})$ as a real vector
space of dimension $\binom{p+1}{2}$. Let $T' := \{A_1(x_1, \ldots, x_p), \ldots, A_N(x_1, \ldots, x_p)\} \subset S_p(\mathbb{R})$.
It is straightforward to show that for any $B \in \operatorname{conv} T$ one has
$B(x_1, \ldots, x_p) \in \operatorname{conv} T'$. Let $\tilde{T}$ be the restriction of $\operatorname{conv} T'$ to the line in
$S_p(\mathbb{R})$

$$\{X = (x_{ij}) \in S_p(\mathbb{R}) : x_{ij} = \lambda_i(A)\delta_{ij}, \text{ for } i + j > 2\}.$$

Clearly $A(x_1, \ldots, x_p) \in \tilde{T}$. Hence $\tilde{T} = [C(x_1, \ldots, x_p), D(x_1, \ldots, x_p)]$ for some
$C, D \in \operatorname{conv} T$. It is straightforward to show that $C, D \in \operatorname{conv}_{\binom{p+1}{2}-1} T$.
See Problem 1. Hence $\max_{X \in \tilde{T}} x_{11} = \max(\langle Cx_1, x_1\rangle, \langle Dx_1, x_1\rangle)$. Without
loss of generality we may assume that the above maximum is achieved
for the matrix $C$. Hence $C(x_1, \ldots, x_p)$ is a diagonal matrix such that
$\lambda_1(C(x_1, \ldots, x_p)) \ge \lambda_1(A)$ and $\lambda_i(C(x_1, \ldots, x_p)) = \lambda_i(A)$ for $i = 2, \ldots, p$.
Let $U = \operatorname{span}(x_1, \ldots, x_p)$. Since $x_1, \ldots, x_p$ are orthonormal it follows that
$\lambda_i(Q(C,U)) = \lambda_i(C(x_1, \ldots, x_p))$ for $i = 1, \ldots, p$. Corollary 4.4.7 yields
that $\lambda_{(p)}(C) \ge \lambda_{(p)}(A)$. Since $h$ is nondecreasing on $D_p$ we get $h(\lambda_{(p)}(C)) \ge h(\lambda_{(p)}(A))$.
See Problem 7, which shows that (4.14.1) is sharp. □

Theorem 4.14.7 Let $V$ be an $n$-dimensional IPS over $\mathbb{C}$. Let $p \in [1,n] \cap \mathbb{N}$
and $D \subset \mathbb{R}^n_{\searrow}$ be a convex Schur set. Let $D_p$ be the projection
of $D$ on the first $p$ coordinates. Let $h : D_p \to \mathbb{R}$ and assume that
$f : \lambda^{-1}(D) \to \mathbb{R}$ is the spectral function given by $A \mapsto h(\lambda_{(p)}(A))$, where
$\lambda_{(p)}(A) = (\lambda_1(A), \ldots, \lambda_p(A))^\top$. Let $S \subset \lambda^{-1}(D)$. Assume that $h$ is nondecreasing
on $D_p$. Then

(4.14.2) $\sup_{A \in \operatorname{conv} S} f(A) = \sup_{B \in \operatorname{conv}_{p^2-1} S} f(B),$

and this result is sharp.

See Problem 8 for the proof of the theorem.

It is possible to improve Theorems 4.14.6 and 4.14.7 in special interesting
cases for $p > 1$.

Definition 4.14.8 Let $V$ be an $n$-dimensional IPS over $\mathbb{F} = \mathbb{R}, \mathbb{C}$ and
$p \in [1,n] \cap \mathbb{Z}$. Let $A \in S(V)$. Then the $p$-upper multiplicity of $\lambda_p(A)$, denoted
by $\operatorname{upmul}(A,p)$, is a natural number in $[1,p]$ such that

$$\lambda_{p-\operatorname{upmul}(A,p)}(A) > \lambda_{p-\operatorname{upmul}(A,p)+1}(A) = \ldots = \lambda_p(A), \quad \text{where } \lambda_0(A) = \infty.$$

For any $\mathcal{C} \subset S(V)$ let $\operatorname{upmul}(\mathcal{C},p) := \max_{A \in \mathcal{C}} \operatorname{upmul}(A, p)$.

See Problem 14 for sets satisfying $\operatorname{upmul}(\mathcal{C}, p) \le k$ for any $k \in \mathbb{N}$.
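To make Definition 4.14.8 concrete, here is a minimal sketch that computes the $p$-upper multiplicity from a list of eigenvalues. The function name `upmul` mirrors the notation of the definition, but the implementation and the sample spectrum are illustrative assumptions; exact equality is used, so for floating point eigenvalues a tolerance would be needed.

```python
def upmul(eigenvalues, p):
    """p-upper multiplicity of lambda_p, given the eigenvalues in any order (1 <= p <= n)."""
    lam = sorted(eigenvalues, reverse=True)   # lambda_1 >= ... >= lambda_n
    count = 1
    # walk upward from position p while the eigenvalue stays equal to lambda_p;
    # stopping at count == p corresponds to the convention lambda_0 = infinity
    while count < p and lam[p - 1 - count] == lam[p - 1]:
        count += 1
    return count

spectrum = [5, 3, 3, 3, 1]
print([upmul(spectrum, p) for p in range(1, 6)])   # prints: [1, 1, 2, 3, 1]
```

For instance, for $p = 4$ the value is 3 because $\lambda_1 = 5 > \lambda_2 = \lambda_3 = \lambda_4 = 3$.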

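Theorem 4.14.9 Let $V$ be an $n$-dimensional IPS over $\mathbb{R}$. Let $p \in [1,n] \cap \mathbb{N}$
and denote $\mu(p) := \operatorname{upmul}(\operatorname{conv} S, p)$. Then

(4.14.3) $\sup_{A \in \operatorname{conv} S} \lambda_p(A) = \sup_{B \in \operatorname{conv}_{\frac{\mu(p)(2p-\mu(p)+1)}{2}-1} S} \lambda_p(B).$

Proof. For $\mu(p) = p$, (4.14.3) follows from Theorem 4.14.6. Thus, it
is enough to consider the case $p > 1$ and $\mu(p) < p$. As in the proof of
Theorem 4.14.6 we may assume that $S = \{A_1, \ldots, A_N\}$ where $N \le \binom{n+1}{2} + 1$.
Let $\mathcal{M} := \{B \in \operatorname{conv} S : \lambda_p(B) = \max_{A \in \operatorname{conv} S} \lambda_p(A)\}$. Since $\lambda_p(A)$
is a continuous function on $S(V)$ and $\operatorname{conv} S$ is a compact set it follows
that $\mathcal{M}$ is a nonempty compact subset of $\operatorname{conv} S$. Let $\nu := \frac{\mu(p)(2p-\mu(p)+1)}{2} - 1$.
Assume to the contrary that the theorem does not hold, i.e.
$\mathcal{M} \cap \operatorname{conv}_\nu S = \emptyset$. Let $\mathcal{M}' := \{p = (p_1, \ldots, p_N)^\top \in \Pi_N : \sum_{i=1}^{N} p_iA_i \in \mathcal{M}\}$.
Then $\mathcal{M}'$ is a nonempty compact subset of $\Pi_N$ and any $p \in \mathcal{M}'$ has at least
$\nu + 2$ positive coordinates. Introduce the following complete order on $\Pi_N$.
Let $x = (x_1, \ldots, x_N)^\top$, $y = (y_1, \ldots, y_N)^\top \in \mathbb{R}^N$. As in Definition 4.6.1
let $\bar{x} = (\bar{x}_1, \ldots, \bar{x}_N)^\top$, $\bar{y} = (\bar{y}_1, \ldots, \bar{y}_N)^\top \in \mathbb{R}^N$ be the rearrangements of
the coordinates of the vectors $x$ and $y$ in nonincreasing order. Then
$x \ll y$ if either $\bar{x} = \bar{y}$ or $\bar{x}_i = \bar{y}_i$ for $i = 0, \ldots, m-1$ and $\bar{x}_m < \bar{y}_m$
for some $m \in [1,N] \cap \mathbb{N}$. (We assume that $\bar{x}_0 = \bar{y}_0 = \infty$.) Since $\mathcal{M}'$
is compact there exists a maximal element $p = (p_1, \ldots, p_N)^\top \in \mathcal{M}'$, i.e.
$q \in \mathcal{M}' \Rightarrow q \ll p$. Let $\mathcal{I} := \{i \in \langle N\rangle : p_i > 0\}$. Then $\#\mathcal{I} \ge \nu + 2$. Let
$B = \sum_{i=1}^{N} p_iA_i \in \operatorname{conv} S$ be the corresponding matrix with the maximal
$\lambda_p$ on $\operatorname{conv} S$. Assume that $x_1, \ldots, x_n \in V$ is an orthonormal basis of
$V$, consisting of the eigenvectors of $B$ corresponding to the eigenvalues
$\lambda_1(B), \ldots, \lambda_n(B)$ respectively. Let $m := \operatorname{upmul}(B,p) \le \mu(p)$. Consider the
following system of $\frac{m(2p-m+1)}{2}$ equations in $\#\mathcal{I}$ unknowns $q_i \in \mathbb{R}$, $i \in \mathcal{I}$:

$$q_i = 0 \ \text{ for } i \in \langle N\rangle \setminus \mathcal{I}, \qquad \sum_{i \in \mathcal{I}} q_i = 0,$$

$$\sum_{i \in \mathcal{I}} q_i\langle A_ix_j, x_k\rangle = 0, \quad j = 1, \ldots, k-1, \ k = p-m+1, \ldots, p,$$

$$\sum_{i \in \mathcal{I}} q_i\langle A_ix_j, x_j\rangle = \sum_{i \in \mathcal{I}} q_i\langle A_ix_p, x_p\rangle, \quad j = p-1, \ldots, p-m+1 \ \text{ if } m > 1.$$

Since $\#\mathcal{I} \ge \nu + 2 = \frac{\mu(p)(2p-\mu(p)+1)}{2} + 1 > \frac{m(2p-m+1)}{2}$ it follows that there
exists $0 \ne q = (q_1, \ldots, q_N)^\top \in \mathbb{R}^N$ whose coordinates satisfy the above equations.
Let $B(t) := B + tC$, $C := \sum_{i=1}^{N} q_iA_i$, $t \in \mathbb{R}$. Then there exists $a > 0$
such that for $t \in [-a, a]$, $p(t) := p + tq \in \Pi_N \Rightarrow B(t) \in \operatorname{conv} S$. As in the
proof of Theorem 4.14.6 consider the matrix $B(t)(x_1, \ldots, x_p) \in S_p(\mathbb{R})$. Since
$B(0)(x_1, \ldots, x_p) = B(x_1, \ldots, x_p)$ is the diagonal matrix $\operatorname{diag}(\lambda_1(B), \ldots, \lambda_p(B))$,
the conditions on the coordinates of $q$ imply that $B(t)(x_1, \ldots, x_p)$ is of
the form $(\operatorname{diag}(\lambda_1(B), \ldots, \lambda_{p-m}(B)) + tC_1) \oplus (\lambda_p(B) + tb)I_m$ for a corresponding
$C_1 \in S_{p-m}(\mathbb{R})$. Since $\lambda_{p-m}(B) > \lambda_p(B)$ it follows that there exists
$a' \in (0, a]$ such that

$$\lambda_{p-m}(B(t)(x_1, \ldots, x_p)) = \lambda_{p-m}(\operatorname{diag}(\lambda_1(B), \ldots, \lambda_{p-m}(B)) + tC_1) > \lambda_p(B) + tb,$$

$$\lambda_p(B(t)(x_1, \ldots, x_p)) = \lambda_p(B) + tb, \quad \text{for } |t| \le a'.$$

Hence $\lambda_p(B(t)) \ge \lambda_p(B) + tb$ for $|t| \le a'$. As $B(t) \in \operatorname{conv} S$ for $|t| \le a'$ and
$\lambda_p(B) \ge \lambda_p(B(t))$ for $|t| \le a'$ it follows that $b = 0$ and $\lambda_p(B(t)) = \lambda_p(B)$
for $|t| \le a'$. Hence $p + tq \in \mathcal{M}'$ for $|t| \le a'$. Since $q \ne 0$, it is impossible
to have the inequalities $p - a'q \ll p$ and $p + a'q \ll p$. This contradiction
proves the theorem. □

It is possible to show that the above theorem is sharp in the case $\mu(p) = 1$;
see Problem 13 (d2). Similarly one can show (see Problem 10):

Theorem 4.14.10 Let $V$ be an $n$-dimensional IPS over $\mathbb{C}$. Let $p \in [1,n] \cap \mathbb{N}$
and denote $\mu(p) := \operatorname{upmul}(\operatorname{conv} S, p)$. Then

(4.14.4) $\sup_{A \in \operatorname{conv} S} \lambda_p(A) = \sup_{B \in \operatorname{conv}_{\mu(p)(2p-\mu(p))-1} S} \lambda_p(B).$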

Problems 

1. Let $S \subset \mathbb{R}^n$ be a nonempty finite set. Show

a. Let $S = \{x_1, \ldots, x_N\}$. Then $\operatorname{conv} S = \operatorname{conv}_{N-1}(S)$.

b. Any finitely generated convex set is compact.

c. $\operatorname{ext} \operatorname{conv} S \subset S$.

d. Let $f_1, \ldots, f_m : \mathbb{R}^n \to \mathbb{R}$ be linear functions and $a_1, \ldots, a_m \in \mathbb{R}$.
Denote by $A$ the affine space $\{x \in \mathbb{R}^n : f_i(x) = a_i, \ i = 1, \ldots, m\}$.
Assume that $C := \operatorname{conv} S \cap A \ne \emptyset$. Then $C$ is a finitely generated convex
set such that $\operatorname{ext} C \subset \operatorname{conv}_m S$. (Hint: Describe $C$ by $m + 1$ equations
with $\#S$ variables as in part a. Use the fact that any homogeneous
system in $m + 1$ equations and $l > m + 1$ variables has a nontrivial
solution.)

e. Prove Theorem 4.14.2 for a finitely generated convex set $C$ and a
finite $S$.

2. Let $C$ be a convex set of dimension $d$ with a nonempty $\operatorname{ext} C$. Let
$C' = \operatorname{conv} \operatorname{ext} C$. Show

(a) $\operatorname{ext} C' = \operatorname{ext} C$.

(b) Let $d' = \dim C'$. Then $d' \le d$ and the equality holds if and only
if $C' = C$.

(c) $F_j(C) \subset F_{j+1}(C)$ and equality holds if and only if $j \ge d'$.

3. Prove Proposition 4.14.3.

4. Let $C \subset V$ be a convex set. A function $f : C \to \mathbb{R}$ is called concave if
$-f$ is a convex function on $C$. Assume the assumptions of Proposition
4.14.3. Assume in addition $f(C) \subset [0, \infty)$ and $g : C \to (0, \infty)$ is
concave. Then

$$\sup_{x \in \operatorname{conv} S} \frac{f(x)}{g(x)} = \sup_{y \in S} \frac{f(y)}{g(y)}.$$

If in addition $S$ is compact and $f, g$ are continuous on $\operatorname{conv} S$ then
one can replace sup by max.

5. a. Let $x, y \in \mathbb{R}^n$. Show the implication $x \le y \Rightarrow x \preceq y$.

b. Let $D \subset \mathbb{R}^n_{\searrow}$ and assume that $f : D \to \mathbb{R}$ is strong Schur's order
preserving. Show that $f$ is nondecreasing on $D$.

c. Let $i \in [2, n] \cap \mathbb{N}$ and $f$ be the following function on $\mathbb{R}^n$: $(x_1, \ldots, x_n)^\top \mapsto x_i$.
Show that $f$ is nondecreasing on $\mathbb{R}^n$ but not Schur's order preserving
on $\mathbb{R}^n_{\searrow}$.

6. Let $D \subset \mathbb{R}^n_{\searrow}$ be a convex Schur set. Let $V$ be an $n$-dimensional IPS
over $\mathbb{F} = \mathbb{R}, \mathbb{C}$. Let $S \subset S(V)$ be a finite set such that $S \subset \lambda^{-1}(D)$.
Show that $\operatorname{conv} S \subset \lambda^{-1}(D)$.

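7. a. Let $A \in H_n$ and assume that $\operatorname{tr} A = 1$. Show that $\lambda_n(A) \le \frac{1}{n}$ and
equality holds if and only if $A = \frac{1}{n}I_n$.

b. Let $E_{kl} := \big(\frac{\delta_{ik}\delta_{jl} + \delta_{il}\delta_{jk}}{2}\big)_{i,j=1}^{p} \in S_p(\mathbb{R})$ for $1 \le k \le l \le p$ be
the symmetric matrices which have at most two nonzero equal entries
at the locations $(k, l)$ and $(l, k)$ which sum to 1. Let $Q_1, \ldots, Q_{\binom{p+1}{2}} \in S_p(\mathbb{R})$
be defined as follows:

$Q_1 := E_{11} + E_{12}$, $Q_2 := E_{11} - E_{12} + E_{13}$, $\ldots$, $Q_p := E_{11} - E_{1p} + E_{23}$, $\ldots$,
$Q_{2p-3} := E_{11} - E_{2(p-1)} + E_{2p}$, $\ldots$, $Q_{\binom{p}{2}} := E_{11} - E_{(p-2)p} + E_{(p-1)p}$,
$Q_{\binom{p}{2}+1} := E_{11} - E_{(p-1)p}$, $Q_{\binom{p}{2}+i} := E_{ii}$ for $i = 2, \ldots, p$.

Let $S = \{Q_1, \ldots, Q_{\binom{p+1}{2}}\}$. Show that $\frac{1}{p}I_p \in \operatorname{conv} S = \operatorname{conv}_{\binom{p+1}{2}-1} S$ and
$\frac{1}{p}I_p \notin \operatorname{conv}_{\binom{p+1}{2}-2} S$.

c. Let $S \subset S_p(\mathbb{R})$ be defined as in b. Show that $\operatorname{tr} A = 1$ for each
$A \in \operatorname{conv} S$. Hence

$$\max_{A \in \operatorname{conv} S} \lambda_p(A) = \lambda_p\big(\tfrac{1}{p}I_p\big) = \frac{1}{p} > \max_{B \in \operatorname{conv}_{\binom{p+1}{2}-2} S} \lambda_p(B).$$

d. Assume that $n > p$ and let $R_i := Q_i \oplus 0 \in S_n(\mathbb{R})$, where $Q_i$ is
defined in b, for $i = 1, \ldots, \binom{p+1}{2}$. Let $S = \{R_1, \ldots, R_{\binom{p+1}{2}}\}$. Show that

$$\max_{A \in \operatorname{conv} S} \lambda_p(A) = \lambda_p\big(\tfrac{1}{p}I_p \oplus 0\big) = \frac{1}{p} > \max_{B \in \operatorname{conv}_{\binom{p+1}{2}-2} S} \lambda_p(B).$$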



8. a. Prove Theorem 4.14.7 repeating the arguments of Theorem 4.14.6.
(Hint: Note that the condition $\langle Bx_i, x_j\rangle = 0$ for two distinct orthonormal
vectors $x_i, x_j \in V$ is equivalent to two real conditions,
while the condition $\langle Bx_i, x_i\rangle = \lambda_i(A)$ is one real condition for $B \in S(V)$.)

b. Modify the example in Problem 7 to show that Theorem 4.14.7 is
sharp.

9. Let $C = A + \sqrt{-1}B \in M_n(\mathbb{C})$, $A, B \in M_n(\mathbb{R})$.

a. Show $C \in H_n$ if and only if $A$ is symmetric and $B$ is antisymmetric:
$B^\top = -B$.

b. Assume that $C \in H_n$ and let $\hat{C} \in M_{2n}(\mathbb{R})$ be defined as in Problem
6. Show that $\hat{C} \in S_{2n}(\mathbb{R})$ and $\lambda_{2i-1}(\hat{C}) = \lambda_{2i}(\hat{C}) = \lambda_i(C)$ for $i = 1, \ldots, n$.

c. Use the results of b to obtain a weaker version of Theorem 4.14.7
directly from Theorem 4.14.6.

10. Prove Theorem 4.14.10. 
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11. Let F be a field and k £ Z+. A = (a„) e M n (F) is called a 2k + 1- 
diagonal matrix if a^- = if |i— j| > k. (1-diagonal are diagonal and 3- 
diagonal are called tridiagonal.) Then the entries auk+i), ■■■,Q>(n-k)n 
are called the /c-upper diagonal. 

a. Assume that A £ M n (F), n > k and A is 2k + 1-diagonal. Suppose 
furthermore that fc-upper diagonal of A does not have zero elements. 
Show that rank A > n — k. 

b. Suppose in addition to a. H„. Then upmul(A,p) < k for any 
p £ (n). 

12. Let V be an n-dimensional IPS over F = M, C. Let S C S(V) and p £ 
(n) . Define the weak p-upper multiplicity denoted by wupmul(conv S, p) 
as follows. It is the smallest positive integer m < p such that for 
any N = ("J 1 ) + 1 operators A\, ...,An G S there exists a sequence 
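$A_{j,k} \in \mathrm{S}(\mathbf{V})$, $j \in \langle N \rangle$, $k \in \mathbb{N}$, such that $\lim_{k \to \infty} A_{j,k} = A_j$, $j \in \langle N \rangle$,
and $\operatorname{upmul}(\operatorname{conv}\{A_{1,k}, \ldots, A_{N,k}\}, p) \le m$ for $k \in \mathbb{N}$.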

a. Show that wupmul(conv S, p) $\le$ upmul(conv S, p).

b. Show that in Theorems 4.14.9 and 4.14.10 one can replace upmul(conv S, p)
by wupmul(conv S, p).

13. a. Show that for any set $S \subset \mathrm{D}(n, \mathbb{R})$ and $p \in \langle n \rangle$, wupmul(conv S, p) = 1.
(Hint: Use Problem 11.)

b. Let $D_i = \operatorname{diag}(\delta_{i1}, \ldots, \delta_{in})$, $i = 1, \ldots, n$. Let $S := \{D_1, \ldots, D_n\}$. Show
that for $p \in [2, n] \cap \mathbb{Z}$

$\max_{D \in \operatorname{conv} S} \lambda_p(D) = \max_{D \in \operatorname{conv}_{p-1} S} \lambda_p(D) = \tfrac{1}{p} > \max_{D \in \operatorname{conv}_{p-2} S} \lambda_p(D) = 0.$

c. Show that the variation of Theorem 4.14.9 as in Problem 12b for
wupmul(conv S, p) = 1 is sharp.

d. Let $A \in \mathrm{S}_n(\mathbb{R})$ be a tridiagonal matrix with nonzero elements on
the first upper diagonal as in 11b. Let $t \in \mathbb{R}$ and define $D_i(t) = D_i + tA$,
where $D_i$ is defined as in b, for $i = 1, \ldots, n$. Let $S(t) = \{D_1(t), \ldots, D_n(t)\}$. Show

d1. For $t \ne 0$, upmul(conv S(t), p) = 1 for $p \in [2, n] \cap \mathbb{Z}$.
d2. There exists $\epsilon > 0$ such that for any $|t| < \epsilon$

$\max_{A \in \operatorname{conv} S(t)} \lambda_p(A) = \max_{B \in \operatorname{conv}_{p-1} S(t)} \lambda_p(B) > \max_{C \in \operatorname{conv}_{p-2} S(t)} \lambda_p(C).$

Hence Theorem 4.14.9 is sharp in the case upmul(conv S, p) = 1.

14. a. Let $S \subset \mathrm{H}_n$ be a set of $(2k+1)$-diagonal matrices. Assume that either
each $k$-upper diagonal of any $A \in S$ consists of positive elements,
or each $k$-upper diagonal of any $A \in S$ is fixed and consists of nonzero
elements. Show that upmul(conv S, p) $\le k$.

b. Let $S \subset \mathrm{H}_n$ be a set of $(2k+1)$-diagonal matrices. Show that
wupmul(conv S, p) $\le k + 1$.

4.15 Multiplicity index of a subspace of S(V) 

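Definition 4.15.1 Let $\mathbf{V}$ be a finite dimensional IPS over $\mathbb{F} = \mathbb{R}, \mathbb{C}$.
Let $\mathbf{U}$ be a nontrivial subspace of $\mathrm{S}(\mathbf{V})$. Then the multiplicity index of $\mathbf{U}$
is defined as

$\operatorname{mulind} \mathbf{U} := \max\{p \in \mathbb{N} : \exists A \in \mathbf{U} \setminus \{0\} \text{ such that } \lambda_1(A) = \ldots = \lambda_p(A)\}.$

Clearly for any nontrivial $\mathbf{U}$, $\operatorname{mulind} \mathbf{U} \in [1, \dim \mathbf{V}]$. Also $\operatorname{mulind} \mathbf{U} = \dim \mathbf{V}$
if and only if $I \in \mathbf{U}$. The aim of this section is to prove the following theorem.

Theorem 4.15.2 Let $\mathbf{V}$ be an IPS over $\mathbb{F} = \mathbb{R}, \mathbb{C}$ of dimension $n \ge 3$.
For $r \in [2, n-1]$ let $\kappa(r) := \frac{(r-1)(2n-r+2)}{2}$ if $\mathbb{F} = \mathbb{R}$ and
$\kappa(r) := (r-1)(2n-r+2)$ if $\mathbb{F} = \mathbb{C}$. Let $\mathbf{U}$ be a subspace of $\mathrm{S}(\mathbf{V})$. Then $\operatorname{mulind} \mathbf{U} \ge r$
if $\dim \mathbf{U} \ge \kappa(r)$ and this result is sharp.

Proof. Assume first that $\mathbb{F} = \mathbb{R}$. Assume $\dim \mathbf{U} \ge \kappa(r)$. Suppose to
the contrary that $\operatorname{mulind} \mathbf{U} = p < r$. Let $A \in \mathbf{U}$ be such that $\lambda_1(A) = \ldots =
\lambda_p(A) > \lambda_{p+1}(A)$. Assume that

$A\mathbf{x}_i = \lambda_i(A)\mathbf{x}_i, \quad \mathbf{x}_i \in \mathbf{V}, \quad \langle \mathbf{x}_i, \mathbf{x}_j \rangle = \delta_{ij}, \quad i, j = 1, \ldots, n.$

By representing $\mathrm{S}(\mathbf{V})$ as $\mathrm{S}_n(\mathbb{R})$ with respect to the orthonormal basis $\mathbf{x}_1, \ldots, \mathbf{x}_n$
we may assume that $\mathbf{U}$ is a subspace of $\mathrm{S}_n(\mathbb{R})$ and $A = \operatorname{diag}(\lambda_1(A), \ldots, \lambda_n(A))$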

4.16 Analytic functions of hermitian matrices

Denote by $\mathrm{H}_n$ the set of all $n \times n$ hermitian matrices. For $A \in \mathrm{H}_n$ denote
by $\operatorname{spec} A \subset \mathbb{R}$ the spectrum of $A$. $A(z) := A_0 + zA_1$, $A_0, A_1 \in \mathrm{H}_n$, is
called a hermitian pencil. It is known that it is possible to rename the
eigenvalues of $A(z)$ as $\alpha_1(z), \ldots, \alpha_n(z)$ such that each $\alpha_i(z)$ is analytic in
some neighborhood $\mathcal{N}$ of the real axis $\mathbb{R}$. Furthermore the eigenprojection
$P_i(z)$ on each $\alpha_i(z)$ is analytic in $z$ in $\mathcal{N}$. (Note that if $\alpha_i(z)$ has multiplicity
$m_i$ for all but a finite number of points on $\mathbb{R}$, then $P_i(z)$ has rank $m_i$ on $\mathcal{N}$.)
Furthermore, the corresponding eigenvectors $\mathbf{x}_1(z), \ldots, \mathbf{x}_n(z)$ can be chosen
to be analytic in $z \in \mathcal{N}$, such that $\mathbf{x}_1(t), \ldots, \mathbf{x}_n(t)$ are orthonormal for
$t \in \mathbb{R}$. See [Kat80, II.6.1-II.6.2].

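Let $t \in \mathbb{R}$ be fixed. An eigenvalue $\alpha(z) := \alpha_i(z)$ is called regular at $t$ if
the multiplicity of $\alpha(z)$ is fixed for $|z - t| < r$ for some $r = r(t) > 0$. So

(4.16.1) $\alpha(z) = \sum_{j=0}^{\infty} \alpha_j (z - t)^j, \quad \alpha_j \in \mathbb{R}, \; j \in \mathbb{Z}_+.$

Furthermore, given a normalized eigenvector $A(t)\mathbf{x}_0 = \alpha_0 \mathbf{x}_0$, $\mathbf{x}_0^* \mathbf{x}_0 = 1$,
there exists an analytic eigenvector $\mathbf{x}(z)$ corresponding to $\alpha(z)$ satisfying
the conditions:

(4.16.2) $\mathbf{x}(z) = \sum_{j=0}^{\infty} (z - t)^j \mathbf{x}_j, \quad \mathbf{x}_j \in \mathbb{C}^n, \; j \in \mathbb{Z}_+,$
$A(z)\mathbf{x}(z) = \alpha(z)\mathbf{x}(z), \quad \mathbf{x}(s)^* \mathbf{x}(s) = 1 \text{ for } s \in \mathbb{R}.$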

Let $\mathrm{S}_n$ denote the space of all $n \times n$ real symmetric matrices. Suppose that
$A_0, A_1 \in \mathrm{S}_n$. Then $A(z)$ is called a symmetric pencil. In that case the
projection induced by $A(s)$, $s \in \mathbb{R}$, on each $\alpha_i(s)$ must be a real orthogonal
projection. Hence in the expansion (4.16.2) each $\mathbf{x}_j$ can be chosen to be a
real vector.

One can find the formulas for $\alpha_j, \mathbf{x}_j$, $j \in \mathbb{Z}_+$, in terms of $A_0, A_1$, in
particular for $\alpha_1, \alpha_2$, in [Kat80, II.2.4]. In this note we give slightly different
formulas for $\alpha_1, \alpha_2, \alpha_3$.

Let $B \in \mathrm{H}_n$. Denote by $B^\dagger \in \mathrm{H}_n$ the Moore-Penrose inverse of $B$. That
is, $B^\dagger$ is uniquely characterized by the condition that $B^\dagger B = BB^\dagger$ is the
projection on the subspace spanned by all eigenvectors of $B$ corresponding
to the nonzero eigenvalues of $B$.

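Theorem 4.16.1 Let $A_0, A_1 \in \mathrm{H}_n$, $A(z) = A_0 + zA_1$. Let $t \in \mathbb{R}$, $A =
A_0 + tA_1$ and assume that the eigenvalue $\alpha_0 \in \operatorname{spec} A$ is simple for the
pencil $A(z)$ at $z = t$. Suppose that $A\mathbf{x}_0 = \alpha_0 \mathbf{x}_0$, $\mathbf{x}_0^* \mathbf{x}_0 = 1$. Let (4.16.1-
4.16.2) be the Taylor expansion of the eigenvalue $\alpha(z) \in \operatorname{spec} A(z)$ and
a local Taylor expansion of the corresponding eigenvector $\mathbf{x}(z)$ satisfying
$\alpha(t) = \alpha_0$, $\mathbf{x}(t) = \mathbf{x}_0$. Then

$\alpha_1 = \mathbf{x}_0^* A_1 \mathbf{x}_0,$

$\mathbf{x}_1 = (\alpha_0 I - A)^\dagger A_1 \mathbf{x}_0,$

(4.16.3) $\alpha_2 = \mathbf{x}_0^* A_1 \mathbf{x}_1 = \mathbf{x}_0^* A_1 (\alpha_0 I - A)^\dagger A_1 \mathbf{x}_0,$

$\mathbf{x}_2 = (\alpha_0 I - A)^\dagger (A_1 - \alpha_1 I)\mathbf{x}_1 - \tfrac{1}{2}(\mathbf{x}_1^* \mathbf{x}_1)\mathbf{x}_0,$

$\alpha_3 = \mathbf{x}_0^* \big((A_1 - \alpha_1 I)(\alpha_0 I - A)^\dagger\big)^2 A_1 \mathbf{x}_0.$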

Proof. Without loss of generality we assume that $t = 0$, hence
$A = A_0$. We consider first the case where $A_0, A_1$ are real symmetric.
Furthermore, by replacing $A_0, A_1$ with $Q^T A_0 Q$, $Q^T A_1 Q$, where
$Q \in \mathbb{R}^{n \times n}$ is an orthogonal matrix, we may assume that $A_0$ is a diagonal
matrix $\operatorname{diag}(d_1, \ldots, d_n)$, where $d_1 = \alpha_0$ and $d_i \ne \alpha_0$ for $i > 1$. Moreover,
we can assume that $\mathbf{x}_0 = (1, 0, \ldots, 0)^T$. Note that $(\alpha_0 I - A_0)^\dagger =
\operatorname{diag}(0, (\alpha_0 - d_2)^{-1}, \ldots, (\alpha_0 - d_n)^{-1})$. (We are not going to use explicitly
these assumptions, but the reader can see more transparently our arguments
using these assumptions.)

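Recall that we may assume that $\mathbf{x}(s) \in \mathbb{R}^n$. The normalization condition
$\mathbf{x}(s)^T \mathbf{x}(s) = 1$ yields that

(4.16.4) $\sum_{j=0}^{k} \mathbf{x}_j^T \mathbf{x}_{k-j} = 0, \quad k \in \mathbb{N}.$

The equality $A(z)\mathbf{x}(z) = \alpha(z)\mathbf{x}(z)$ yields

(4.16.5) $A_0 \mathbf{x}_k + A_1 \mathbf{x}_{k-1} = \sum_{j=0}^{k} \alpha_j \mathbf{x}_{k-j}, \quad k \in \mathbb{N}.$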

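Since $\alpha(s)$ is real for a real $s$ we deduce that $\alpha_j \in \mathbb{R}$. Consider the
equality (4.16.5) for $k = 1$. Multiply it by $\mathbf{x}_0^T$ and use the equality $\mathbf{x}_0^T A_0 =
\alpha_0 \mathbf{x}_0^T$ to deduce the well known equality $\alpha_1 = \mathbf{x}_0^T A_1 \mathbf{x}_0$, which is the first
equality of (4.16.3). The equality (4.16.5) for $k = 1$ is equivalent to
$(\alpha_0 I - A_0)\mathbf{x}_1 = A_1 \mathbf{x}_0 - \alpha_1 \mathbf{x}_0$. Hence $\mathbf{x}_1$ is of the form

(4.16.6) $\mathbf{x}_1 = (\alpha_0 I - A_0)^\dagger (A_1 \mathbf{x}_0 - \alpha_1 \mathbf{x}_0) + b_1 \mathbf{x}_0 = (\alpha_0 I - A_0)^\dagger A_1 \mathbf{x}_0 + b_1 \mathbf{x}_0,$

for some $b_1$. The orthogonality condition $\mathbf{x}_0^T \mathbf{x}_1 = 0$, which is (4.16.4) for
$k = 1$, implies that $b_1 = 0$. Hence the second equality of (4.16.3) holds.

Multiply the equality (4.16.5) for $k = 2$ by $\mathbf{x}_0^T$ and use $\mathbf{x}_0^T A_0 = \alpha_0 \mathbf{x}_0^T$, $\mathbf{x}_0^T \mathbf{x}_1 =
0$ to obtain $\alpha_2 = \mathbf{x}_0^T A_1 \mathbf{x}_1$. This establishes the third equality of (4.16.3).
Rewrite the equality (4.16.5) for $k = 2$ as $(\alpha_0 I - A_0)\mathbf{x}_2 = A_1 \mathbf{x}_1 - \alpha_1 \mathbf{x}_1 - \alpha_2 \mathbf{x}_0$.
Hence

(4.16.7) $\mathbf{x}_2 = (\alpha_0 I - A_0)^\dagger (A_1 - \alpha_1 I)\mathbf{x}_1 + b_2 \mathbf{x}_0.$

Multiply the above equality by $\mathbf{x}_0^T$ to deduce that $\mathbf{x}_0^T \mathbf{x}_2 = b_2$. (4.16.4) for
$k = 2$ yields $b_2 = -\tfrac{1}{2}\mathbf{x}_1^T \mathbf{x}_1$. This establishes the fourth equality of (4.16.3).
Multiply the equality (4.16.5) for $k = 3$ by $\mathbf{x}_0^T$ to deduce

$\alpha_3 = \mathbf{x}_0^T A_1 \mathbf{x}_2 - \alpha_1 \mathbf{x}_0^T \mathbf{x}_2 = \mathbf{x}_0^T (A_1 - \alpha_1 I)\mathbf{x}_2.$

Observe next that from the first equality in (4.16.3) $\mathbf{x}_0^T (A_1 - \alpha_1 I)\mathbf{x}_0 = 0$.
Substituting (4.16.7) and the second equality of (4.16.3) into the formula for $\alpha_3$
establishes the last equality of (4.16.3).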

We now show that the same formulas hold when $A_0, A_1$ are hermitian.
Observe that if we have a local Taylor expansion of $\mathbf{x}(z)$ then we can
replace $\mathbf{x}(z)$ by $e^{\phi(z)}\mathbf{x}(z)$, where $\phi(z) = \sum_{j=1}^{\infty} \phi_j z^j$ is locally analytic at
$z = t$ and each $\phi_j$ is purely imaginary. Now we repeat the proof of (4.16.3).
The first formula of (4.16.3) holds as in the symmetric case. The equality
(4.16.6) also holds. We now can only deduce the equality $\Re b_1 = 0$. Now
choose $\phi_1$ such that $b_1 = 0$. Hence the second equality of (4.16.3) holds.
Now deduce the third equality of (4.16.3). Next we deduce (4.16.7) and the
equality $\Re b_2 = -\tfrac{1}{2}\mathbf{x}_1^* \mathbf{x}_1$. Now use the corresponding choice of $\phi_2$ to obtain that
$\Im b_2 = 0$. Hence the fourth equality of (4.16.3) holds. Now deduce the last
equality of (4.16.3). □

Note that for the real symmetric case the formulas of $\mathbf{x}_1, \mathbf{x}_2$ in (4.16.3)
are global formulas.
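
The formulas (4.16.3) can be tried on a small example. The following is only a sketch under simplifying assumptions that are not part of the theorem: $t = 0$, $A_0$ is already diagonal with the simple eigenvalue in position `i`, so that $\mathbf{x}_0$ is a standard basis vector and $(\alpha_0 I - A_0)^\dagger$ is diagonal, as in the normalization used at the start of the proof. The function name `perturbation_coefficients` and the helpers are hypothetical.

```python
from fractions import Fraction


def matvec(M, v):
    """Multiply the square matrix M (list of rows) by the vector v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def dot(u, v):
    """Real inner product u^T v."""
    return sum(a * b for a, b in zip(u, v))


def perturbation_coefficients(d, A1, i=0):
    """Return (alpha_1, alpha_2, alpha_3) from (4.16.3) for A(z) = diag(d) + z*A1.

    Assumes t = 0, A1 real symmetric, and d[i] a simple eigenvalue of diag(d).
    """
    n = len(d)
    a0 = d[i]
    x0 = [Fraction(int(j == i)) for j in range(n)]
    # Moore-Penrose inverse of (a0*I - A0): diagonal, with 0 in position i.
    pinv = [Fraction(0) if j == i else Fraction(1) / (a0 - d[j]) for j in range(n)]

    def apply_pinv(v):
        return [pinv[j] * v[j] for j in range(n)]

    a1 = dot(x0, matvec(A1, x0))

    def shifted(v):
        # (A1 - a1*I) v
        return [w - a1 * vj for w, vj in zip(matvec(A1, v), v)]

    x1 = apply_pinv(matvec(A1, x0))
    a2 = dot(x0, matvec(A1, x1))
    half_norm = Fraction(1, 2) * dot(x1, x1)
    x2 = [y - half_norm * x0j for y, x0j in zip(apply_pinv(shifted(x1)), x0)]
    a3 = dot(x0, shifted(x2))
    return a1, a2, a3
```

For $A_0 = \operatorname{diag}(0, 1)$ and $A_1$ the $2 \times 2$ matrix with zero diagonal and unit off-diagonal entries, the eigenvalue through $0$ is $\frac{1}{2}(1 - \sqrt{1 + 4z^2}) = -z^2 + z^4 - \ldots$, so one expects $\alpha_1 = 0$, $\alpha_2 = -1$, $\alpha_3 = 0$, which is what the sketch returns.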

Theorem 4.16.2 Let $A_0 \in \mathrm{H}_n$, $n \ge 2$. Assume that $\alpha_0$ is a simple
eigenvalue of $A_0$, with the corresponding eigenvector $A_0 \mathbf{x}_0 = \alpha_0 \mathbf{x}_0$, $\mathbf{x}_0^* \mathbf{x}_0 = 1$.
Suppose furthermore that $|\lambda - \alpha_0| \ge r > 0$ for any other eigenvalue $\lambda$
of $A_0$. Let $A_1 \in \mathrm{H}_n$, $A_1 \ne 0$, and denote by $\|A_1\|$ the $l_2$ norm of $A_1$,
i.e. the maximal absolute value of the eigenvalues of $A_1$. Let $\alpha(z)$ be the
eigenvalue of $A(z) = A_0 + zA_1$, which is analytic in the neighborhood of $\mathbb{R}$
and satisfies the condition $\alpha(0) = \alpha_0$. Let $\alpha_1, \alpha_2$ be given by (4.16.3),
where $A = A_0$. Fix $0 < c < \frac{r}{2\|A_1\|}$. Then

(4.16.8) $|\alpha(s) - (\alpha_0 + \alpha_1 s + \alpha_2 s^2)| \le \frac{4\|A_1\|^3 |s|^3}{(r - 2c\|A_1\|)^2} \quad \text{for all } s \in [-c, c].$

Proof. Let $\lambda_1(s) \ge \ldots \ge \lambda_n(s)$ be the eigenvalues of $A(s)$, $s \in \mathbb{R}$. Note
that $\lambda_1(0), \ldots, \lambda_n(0)$ are the eigenvalues of $A_0$. Assume that $\lambda_i(0) = \alpha_0$.
Let $\rho(s) = \min(\lambda_{i-1}(s) - \lambda_i(s), \lambda_i(s) - \lambda_{i+1}(s))$, where $\lambda_0(s) = \infty$, $\lambda_{n+1}(s) =
-\infty$. Thus $r \le \rho(0)$. Let $\beta_1 \ge \ldots \ge \beta_n$ be the eigenvalues of $A_1$. Then
$\|A_1\| = \max(|\beta_1|, |\beta_n|)$. Apply Lidskii's theorem to $C = A(s) - A_0$ [Kat80,
III.Thm 6.10] to deduce that

$|\lambda_j(s) - \lambda_j(0)| \le |s|\,\|A_1\|, \quad j = 1, \ldots, n.$

Hence

(4.16.9) $\rho(s) \ge \rho(0) - 2|s|\,\|A_1\| > 0 \quad \text{for } s \in \Big(-\frac{r}{2\|A_1\|}, \frac{r}{2\|A_1\|}\Big).$

In particular, $\lambda_i(s)$ is a simple eigenvalue of $A(s)$ in the above interval.
Assume that $s$ is in the interval given in (4.16.9). It is straightforward to
show that

(4.16.10) $\|(\alpha(s)I - A(s))^\dagger\| = \frac{1}{\rho(s)} \le \frac{1}{\rho(0) - 2|s|\,\|A_1\|}.$

(One can assume that $A(s)$ is a diagonal matrix.)

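Use the Taylor theorem with remainder to obtain the equality

(4.16.11) $\alpha(s) - (\alpha_0 + s\alpha_1 + s^2 \alpha_2) = \tfrac{1}{6}\alpha^{(3)}(t)s^3 \quad \text{for some } t, \; |t| < |s|.$

Use Theorem 4.16.1 to deduce that

$\tfrac{1}{6}\alpha^{(3)}(t) = \mathbf{x}_i(t)^* \big((A_1 - \alpha'(t)I)(\alpha(t)I - A(t))^\dagger\big)^2 A_1 \mathbf{x}_i(t),$

where $\mathbf{x}_i(s)$ is an eigenvector of $A(s)$ of length one corresponding to $\alpha(s) = \lambda_i(s)$.
As $\alpha'(t) = \mathbf{x}_i(t)^* A_1 \mathbf{x}_i(t)$ we deduce that $|\alpha'(t)| \le \|A_1\|$. Hence
$\|A_1 - \alpha'(t)I\| \le 2\|A_1\|$. Therefore

$|\tfrac{1}{6}\alpha^{(3)}(t)| \le \big\|\big((A_1 - \alpha'(t)I)(\alpha(t)I - A(t))^\dagger\big)^2 A_1\big\| \le \frac{4\|A_1\|^3}{\rho(t)^2}.$

Use the inequality (4.16.10) and the inequality $r \le \rho(0)$ to deduce the
theorem. □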
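
As a sanity check of the shape of the estimate, the following sketch compares both sides on one hand-picked pencil. It assumes that (4.16.8) reads $|\alpha(s) - (\alpha_0 + \alpha_1 s + \alpha_2 s^2)| \le 4\|A_1\|^3 |s|^3 / (r - 2c\|A_1\|)^2$, which is how the proof above suggests the bound should be read. The example is $A_0 = \operatorname{diag}(0, 1)$ with $A_1$ the symmetric matrix with zero diagonal and unit off-diagonal entries, so that $\alpha_0 = \alpha_1 = 0$, $\alpha_2 = -1$, $r = 1$, $\|A_1\| = 1$. The function names are hypothetical.

```python
import math


def alpha(s):
    """Eigenvalue of [[0, s], [s, 1]] that equals 0 at s = 0."""
    return (1.0 - math.sqrt(1.0 + 4.0 * s * s)) / 2.0


def remainder_bound(s, c, r=1.0, norm_a1=1.0):
    """Right-hand side of the assumed form of (4.16.8)."""
    return 4.0 * norm_a1 ** 3 * abs(s) ** 3 / (r - 2.0 * c * norm_a1) ** 2


def check_bound(c, steps=200):
    """Check the estimate on a grid of points s in [-c, c]."""
    for k in range(-steps, steps + 1):
        s = c * k / steps
        # Second-order Taylor polynomial: alpha_0 + alpha_1*s + alpha_2*s^2 = -s^2.
        lhs = abs(alpha(s) + s * s)
        if lhs > remainder_bound(s, c) + 1e-15:
            return False
    return True
```

A grid check of this kind does not prove anything; it only guards against a misplaced constant or exponent when the bound is transcribed.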



4.17 Eigenvalues of sum of hermitian matrices

Put here

Lidskii's theorem which is equivalent to $\lambda(A) - \lambda(B) \prec \lambda(A - B)$.
Weyl's inequality.

Kato's inequality [Kat80, II.5.Thm 6.11] If $f : \mathbb{R} \to \mathbb{R}$ is convex then

$\sum_{i=1}^{n} f(\lambda_i(A) - \lambda_i(B)) \le \sum_{i=1}^{n} f(\lambda_i(A - B)).$

In particular use $f(x) = |x|^p$, $p \ge 1$.



Chapter 5 

Elements of Multilinear 
Algebra 

5.1 Tensor product of two free modules 

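Let $\mathbb{D}$ be a domain. Recall that $\mathbf{N}$ is called a free finite dimensional module
if $\mathbf{N}$ has a finite basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$, i.e. $\dim \mathbf{N} = n$. Then $\mathbf{N}' := \operatorname{Hom}(\mathbf{N}, \mathbb{D})$
is a free $n$-dimensional module. Furthermore we can identify $\operatorname{Hom}(\mathbf{N}', \mathbb{D})$
with $\mathbf{N}$. (See Problem 1.)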

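Definition 5.1.1 Let $\mathbf{M}, \mathbf{N}$ be two free finite dimensional modules over
an integral domain $\mathbb{D}$. Then the tensor product $\mathbf{M} \otimes_{\mathbb{D}} \mathbf{N}$ is identified with
$\operatorname{Hom}(\mathbf{N}', \mathbf{M})$. Moreover, for each $\mathbf{m} \in \mathbf{M}$, $\mathbf{n} \in \mathbf{N}$ we identify $\mathbf{m} \otimes \mathbf{n} \in
\mathbf{M} \otimes_{\mathbb{D}} \mathbf{N}$ with the linear transformation $\mathbf{m} \otimes \mathbf{n} : \mathbf{N}' \to \mathbf{M}$ given by $\mathbf{f} \mapsto
\mathbf{f}(\mathbf{n})\mathbf{m}$ for any $\mathbf{f} \in \mathbf{N}'$.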

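Proposition 5.1.2 Let $\mathbf{M}, \mathbf{N}$ be free modules over a domain $\mathbb{D}$ with
bases $[\mathbf{d}_1, \ldots, \mathbf{d}_m]$, $[\mathbf{e}_1, \ldots, \mathbf{e}_n]$ respectively. Then $\mathbf{M} \otimes_{\mathbb{D}} \mathbf{N}$ is a free module
with the basis $\mathbf{d}_i \otimes \mathbf{e}_j$, $i = 1, \ldots, m$, $j = 1, \ldots, n$. In particular

(5.1.1) $\dim \mathbf{M} \otimes \mathbf{N} = \dim \mathbf{M} \, \dim \mathbf{N}.$

(See Problem 3.) For an abstract definition of $\mathbf{M} \otimes_{\mathbb{D}} \mathbf{N}$ for any two
$\mathbb{D}$-modules see Problem 16.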

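Intuitively, one views $\mathbf{M} \otimes \mathbf{N}$ as a linear span of all elements of the form
$\mathbf{m} \otimes \mathbf{n}$, where $\mathbf{m} \in \mathbf{M}$, $\mathbf{n} \in \mathbf{N}$, satisfying the following natural properties:

• $a(\mathbf{m} \otimes \mathbf{n}) = (a\mathbf{m}) \otimes \mathbf{n} = \mathbf{m} \otimes (a\mathbf{n})$ for all $a \in \mathbb{D}$.

• $(a_1 \mathbf{m}_1 + a_2 \mathbf{m}_2) \otimes \mathbf{n} = a_1(\mathbf{m}_1 \otimes \mathbf{n}) + a_2(\mathbf{m}_2 \otimes \mathbf{n})$ for all $a_1, a_2 \in \mathbb{D}$.
(Linearity in the first variable.)

• $\mathbf{m} \otimes (a_1 \mathbf{n}_1 + a_2 \mathbf{n}_2) = a_1(\mathbf{m} \otimes \mathbf{n}_1) + a_2(\mathbf{m} \otimes \mathbf{n}_2)$ for all $a_1, a_2 \in \mathbb{D}$.
(Linearity in the second variable.)

The element $\mathbf{m} \otimes \mathbf{n}$ is called a decomposable tensor, or decomposable element
(vector), or rank one tensor.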

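Proposition 5.1.3 Let $\mathbf{M}, \mathbf{N}$ be free modules over a domain $\mathbb{D}$ with
bases $[\mathbf{d}_1, \ldots, \mathbf{d}_m]$, $[\mathbf{e}_1, \ldots, \mathbf{e}_n]$ respectively. Then any $\tau \in \mathbf{M} \otimes_{\mathbb{D}} \mathbf{N}$ is given by

(5.1.2) $\tau = \sum_{i=j=1}^{i=m, j=n} a_{ij} \mathbf{d}_i \otimes \mathbf{e}_j, \quad A = [a_{ij}] \in \mathbb{D}^{m \times n}.$

Let $[\mathbf{u}_1, \ldots, \mathbf{u}_m]$, $[\mathbf{v}_1, \ldots, \mathbf{v}_n]$ be other bases of $\mathbf{M}, \mathbf{N}$ respectively. Assume
that $\tau = \sum_{i,j=1}^{m,n} b_{ij} \mathbf{u}_i \otimes \mathbf{v}_j$ and let $B = [b_{ij}] \in \mathbb{D}^{m \times n}$. Then $B = PAQ^T$,
where $P$ and $Q$ are the transition matrices from the bases $[\mathbf{d}_1, \ldots, \mathbf{d}_m]$ to
$[\mathbf{u}_1, \ldots, \mathbf{u}_m]$ and $[\mathbf{e}_1, \ldots, \mathbf{e}_n]$ to $[\mathbf{v}_1, \ldots, \mathbf{v}_n]$.
($[\mathbf{d}_1, \ldots, \mathbf{d}_m] = [\mathbf{u}_1, \ldots, \mathbf{u}_m]P$, $[\mathbf{e}_1, \ldots, \mathbf{e}_n] = [\mathbf{v}_1, \ldots, \mathbf{v}_n]Q$.)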

See Problem 6. 

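Definition 5.1.4 Let $\mathbf{M}, \mathbf{N}$ be free finite dimensional modules over a
domain $\mathbb{D}$. Let $\tau \in \mathbf{M} \otimes_{\mathbb{D}} \mathbf{N}$ be given by (5.1.2). The rank of $\tau$, denoted by
$\operatorname{rank} \tau$, is the rank of the representation matrix $A$, i.e. $\operatorname{rank} \tau = \operatorname{rank} A$.
The tensor rank of $\tau$, denoted by $\operatorname{Rank} \tau$, is the minimal $k$ such that $\tau =
\sum_{l=1}^{k} \mathbf{m}_l \otimes \mathbf{n}_l$ for some $\mathbf{m}_l \in \mathbf{M}$, $\mathbf{n}_l \in \mathbf{N}$, $l = 1, \ldots, k$.

$\operatorname{rank} \tau$ is independent of the choice of bases in $\mathbf{M}$ and $\mathbf{N}$. (Problem 7.)
Since $\mathbf{M} \otimes_{\mathbb{D}} \mathbf{N}$ has a basis consisting of decomposable tensors it follows that

(5.1.3) $\operatorname{Rank} \tau \le \min(\dim \mathbf{M}, \dim \mathbf{N})$ for any $\tau \in \mathbf{M} \otimes_{\mathbb{D}} \mathbf{N}$.

See Problem 8.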

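Proposition 5.1.5 Let $\mathbf{M}, \mathbf{N}$ be free finite dimensional modules over
a domain $\mathbb{D}$. Let $\tau \in \mathbf{M} \otimes_{\mathbb{D}} \mathbf{N}$. Then $\operatorname{rank} \tau \le \operatorname{Rank} \tau$. If $\mathbb{D}$ is a Bezout
domain then $\operatorname{rank} \tau = \operatorname{Rank} \tau$.

Proof. Assume that $\mathbf{M}, \mathbf{N}$ have bases as in Proposition 5.1.3. Suppose
that (5.1.2) holds. Let $\tau = \sum_{l=1}^{k} \mathbf{m}_l \otimes \mathbf{n}_l$. Clearly, each $\mathbf{m}_l \otimes \mathbf{n}_l =
\sum_{i,j=1}^{m,n} a_{ij,l} \mathbf{d}_i \otimes \mathbf{e}_j$, where $A_l := [a_{ij,l}]_{i,j=1}^{m,n} \in \mathbb{D}^{m \times n}$ is a rank one matrix.
Then $A = \sum_{l=1}^{k} A_l$. It is straightforward to show that $\operatorname{rank} A \le k$. This
shows that $\operatorname{rank} \tau \le \operatorname{Rank} \tau$.

Assume that $\mathbb{D}$ is BD. Let $P \in \mathbf{GL}(m, \mathbb{D})$ be such that $B := PA = [b_{ij}] \in \mathbb{D}^{m \times n}$
is a Hermite normal form of $A$. In particular, the first $r := \operatorname{rank} A$
rows of $B$ are nonzero rows, and all other rows of $B$ are zero rows. Let
$[\mathbf{u}_1, \ldots, \mathbf{u}_m] := [\mathbf{d}_1, \ldots, \mathbf{d}_m]P^{-1}$ be a basis in $\mathbf{M}$. Proposition 5.1.3 yields
that $\tau = \sum_{i,j=1}^{m,n} b_{ij} \mathbf{u}_i \otimes \mathbf{e}_j$. Define $\mathbf{n}_l = \sum_{j=1}^{n} b_{lj} \mathbf{e}_j$, $l = 1, \ldots, r$. Then
$\tau = \sum_{l=1}^{r} \mathbf{u}_l \otimes \mathbf{n}_l$. Hence $r \ge \operatorname{Rank} \tau$, which implies that $\operatorname{rank} \tau = \operatorname{Rank} \tau$. □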



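Proposition 5.1.6 Let $\mathbf{M}_i, \mathbf{N}_i$ be free finite dimensional modules over
$\mathbb{D}$. Let $T_i : \mathbf{M}_i \to \mathbf{N}_i$, $i = 1, 2$, be homomorphisms. Then there exists a unique
homomorphism $T : \mathbf{M}_1 \otimes \mathbf{M}_2 \to \mathbf{N}_1 \otimes \mathbf{N}_2$ such that $T(\mathbf{m}_1 \otimes \mathbf{m}_2) =
(T_1 \mathbf{m}_1) \otimes (T_2 \mathbf{m}_2)$ for all $\mathbf{m}_1 \in \mathbf{M}_1$, $\mathbf{m}_2 \in \mathbf{M}_2$. This homomorphism is
denoted by $T_1 \otimes T_2$.

Suppose furthermore that $\mathbf{W}_1, \mathbf{W}_2$ are free finite dimensional $\mathbb{D}$-modules,
and $P_i : \mathbf{N}_i \to \mathbf{W}_i$, $i = 1, 2$, are homomorphisms. Then $(P_1 \otimes P_2)(T_1 \otimes T_2) =
(P_1 T_1) \otimes (P_2 T_2)$.

See Problem 9.

Since each homomorphism $T_i : \mathbf{M}_i \to \mathbf{N}_i$, $i = 1, 2$, is represented by a
matrix, one can reduce the definition of $T_1 \otimes T_2$ to the notion of tensor
product of two matrices $A_1 \in \mathbb{D}^{n_1 \times m_1}$, $A_2 \in \mathbb{D}^{n_2 \times m_2}$. This tensor product
is called the Kronecker product.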

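Definition 5.1.7 Let $A = [a_{ij}]_{i,j=1}^{m,n} \in \mathbb{D}^{m \times n}$, $B = [b_{ij}]_{i,j=1}^{p,q} \in \mathbb{D}^{p \times q}$.
Then $A \otimes B \in \mathbb{D}^{mp \times nq}$ is the following block matrix:

(5.1.4) $A \otimes B := \begin{bmatrix} a_{11}B & a_{12}B & \cdots & a_{1n}B \\ a_{21}B & a_{22}B & \cdots & a_{2n}B \\ \vdots & \vdots & & \vdots \\ a_{m1}B & a_{m2}B & \cdots & a_{mn}B \end{bmatrix}.$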

In the rest of the section we discuss the symmetric and skew symmetric
tensor products of $\mathbf{M} \otimes \mathbf{M}$.



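Definition 5.1.8 Let $\mathbf{M}$ be a free finite dimensional module over $\mathbb{D}$.
Denote $\mathbf{M}^{\otimes 2} := \mathbf{M} \otimes \mathbf{M}$. The submodule $\operatorname{Sym}^2 \mathbf{M} \subset \mathbf{M}^{\otimes 2}$, called the
2-symmetric power of $\mathbf{M}$, is spanned by tensors of the form $\operatorname{sym}_2(\mathbf{m}, \mathbf{n}) :=
\mathbf{m} \otimes \mathbf{n} + \mathbf{n} \otimes \mathbf{m}$ for all $\mathbf{m}, \mathbf{n} \in \mathbf{M}$. $\operatorname{sym}_2(\mathbf{m}, \mathbf{n}) = \operatorname{sym}_2(\mathbf{n}, \mathbf{m})$ is called a
2-symmetric product of $\mathbf{m}$ and $\mathbf{n}$, or simply a symmetric product. Any vector
$\tau \in \operatorname{Sym}^2 \mathbf{M}$ is called a 2-symmetric tensor, or simply a symmetric tensor.
The subspace $\bigwedge^2 \mathbf{M} \subset \mathbf{M}^{\otimes 2}$, called the 2-exterior power of $\mathbf{M}$, is spanned by
all tensors of the form $\mathbf{m} \wedge \mathbf{n} := \mathbf{m} \otimes \mathbf{n} - \mathbf{n} \otimes \mathbf{m}$, for all $\mathbf{m}, \mathbf{n} \in \mathbf{M}$. $\mathbf{m} \wedge \mathbf{n} =
-\mathbf{n} \wedge \mathbf{m}$ is called the wedge product of $\mathbf{m}$ and $\mathbf{n}$. Any vector $\tau \in \bigwedge^2 \mathbf{M}$ is
called a 2-skew symmetric tensor, or simply a skew symmetric tensor.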

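Since $\mathbf{M}^{\otimes 2}$ can be identified with $\mathbb{D}^{m \times m}$ it follows that $\operatorname{Sym}^2(\mathbf{M})$ and
$\bigwedge^2 \mathbf{M}$ can be identified with the submodules of symmetric and skew symmetric
matrices respectively. See Problem 12. Observe next that $2\mathbf{m} \otimes \mathbf{n} =
\operatorname{sym}_2(\mathbf{m}, \mathbf{n}) + \mathbf{m} \wedge \mathbf{n}$. Assume that 2 is a unit in $\mathbb{D}$. Then $\mathbf{M}^{\otimes 2} =
\operatorname{Sym}^2(\mathbf{M}) \oplus \bigwedge^2 \mathbf{M}$. Hence any tensor $\tau \in \mathbf{M}^{\otimes 2}$ can be decomposed uniquely
to a sum $\tau = \tau_s + \tau_a$ where $\tau_s, \tau_a \in \mathbf{M}^{\otimes 2}$ are symmetric and skew symmetric
tensors respectively. (See Problem 12.)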
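
In matrix terms the decomposition $\tau = \tau_s + \tau_a$ is the familiar splitting of a square matrix into its symmetric and skew symmetric parts. The following sketch illustrates it over the rationals, where 2 is a unit; the function name `sym_skew_parts` is hypothetical.

```python
from fractions import Fraction


def sym_skew_parts(A):
    """Split a square matrix A into (symmetric part, skew symmetric part).

    Uses exact rational arithmetic, so the inverse of 2 is available.
    """
    n = len(A)
    half = Fraction(1, 2)
    sym = [[half * (A[i][j] + A[j][i]) for j in range(n)] for i in range(n)]
    skew = [[half * (A[i][j] - A[j][i]) for j in range(n)] for i in range(n)]
    return sym, skew
```

Over a domain such as $\mathbb{Z}$, where 2 is not a unit, the division by 2 is not available and the sum of the two submodules need not be all of $\mathbf{M}^{\otimes 2}$.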

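Proposition 5.1.9 Let $\mathbf{M}, \mathbf{N}$ be finite dimensional modules over $\mathbb{D}$.
Let $T \in \operatorname{Hom}(\mathbf{M}, \mathbf{N})$. Then

$T \otimes T : \operatorname{Sym}^2 \mathbf{M} \to \operatorname{Sym}^2 \mathbf{N}, \quad T \otimes T : \bigwedge^2 \mathbf{M} \to \bigwedge^2 \mathbf{N}.$

See Problem 13.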

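Definition 5.1.10 Let $\mathbf{M}, \mathbf{N}$ be finite dimensional modules over $\mathbb{D}$. Let
$T \in \operatorname{Hom}(\mathbf{M}, \mathbf{N})$. Then $T \wedge T \in \operatorname{Hom}(\bigwedge^2 \mathbf{M}, \bigwedge^2 \mathbf{N})$ is defined as the
restriction of $T \otimes T$ to $\bigwedge^2 \mathbf{M}$.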

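Proposition 5.1.11 Let $\mathbf{M}, \mathbf{N}$ be finite dimensional modules over $\mathbb{D}$.
Let $T \in \operatorname{Hom}(\mathbf{M}, \mathbf{N})$. Then

1. Assume that $[\mathbf{d}_1, \ldots, \mathbf{d}_m]$.

2. Assume that $S \in \operatorname{Hom}(\mathbf{L}, \mathbf{M})$. Show that $TS \wedge TS = (T \wedge T)(S \wedge S)$.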



Problems 

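1. Let $\mathbf{N}$ be a free module with a basis $[\mathbf{e}_1, \ldots, \mathbf{e}_n]$. Show

• $\mathbf{N}' := \operatorname{Hom}(\mathbf{N}, \mathbb{D})$ is a free module with a basis $[\mathbf{f}_1, \ldots, \mathbf{f}_n]$,
where $\mathbf{f}_i(\mathbf{e}_j) = \delta_{ij}$, $i, j = 1, \ldots, n$.

• Show that $(\mathbf{N}')'$ can be identified with $\mathbf{N}$ as follows. To each
$\mathbf{n} \in \mathbf{N}$ associate the following functional $\hat{\mathbf{n}} : \mathbf{N}' \to \mathbb{D}$ defined by
$\hat{\mathbf{n}}(\mathbf{f}) = \mathbf{f}(\mathbf{n})$ for each $\mathbf{f} \in \mathbf{N}'$. Show that $\hat{\mathbf{n}}$ is a linear functional
on $\mathbf{N}'$ and any $\tau \in (\mathbf{N}')'$ is equal to a unique $\hat{\mathbf{n}}$.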

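2. Let $\mathbb{F}$ be a field and $\mathbf{V}$ an $n$-dimensional vector space over $\mathbb{F}$. Then
$\mathbf{V}' := \operatorname{Hom}(\mathbf{V}, \mathbb{F})$ is called the dual space of $\mathbf{V}$. Show

(a) $(\mathbf{V}')'$ can be identified with $\mathbf{V}$. I.e. for each $\mathbf{v} \in \mathbf{V}$ let $\hat{\mathbf{v}} :
\mathbf{V}' \to \mathbb{F}$ be the linear functional given by $\hat{\mathbf{v}}(\mathbf{f}) = \mathbf{f}(\mathbf{v})$. Then any
$\psi \in (\mathbf{V}')'$ is of the form $\hat{\mathbf{v}}$ for some $\mathbf{v} \in \mathbf{V}$.

(b) For $X \subset \mathbf{V}$, $F \subset \mathbf{V}'$ denote by $X^\perp := \{\mathbf{f} \in \mathbf{V}' : \mathbf{f}(\mathbf{x}) = 0, \forall \mathbf{x} \in
X\}$, $F^\perp := \{\mathbf{v} \in \mathbf{V} : \mathbf{f}(\mathbf{v}) = 0, \forall \mathbf{f} \in F\}$. Then $X^\perp, F^\perp$ are
subspaces of $\mathbf{V}', \mathbf{V}$ respectively satisfying

$(X^\perp)^\perp = \operatorname{span}(X), \quad \dim X^\perp = n - \dim \operatorname{span}(X),$
$(F^\perp)^\perp = \operatorname{span}(F), \quad \dim F^\perp = n - \dim \operatorname{span}(F).$

(c) Let $\mathbf{U}_1, \ldots, \mathbf{U}_k$ be $k$ subspaces of either $\mathbf{V}$ or $\mathbf{V}'$. Then

$(\cap_{i=1}^{k} \mathbf{U}_i)^\perp = \sum_{i=1}^{k} \mathbf{U}_i^\perp, \quad (\sum_{i=1}^{k} \mathbf{U}_i)^\perp = \cap_{i=1}^{k} \mathbf{U}_i^\perp.$

(d) For each pair of bases $\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$, $\{\mathbf{f}_1, \ldots, \mathbf{f}_n\}$ in $\mathbf{V}, \mathbf{V}'$ respectively
there exist unique dual bases $\{\mathbf{g}_1, \ldots, \mathbf{g}_n\}$, $\{\mathbf{u}_1, \ldots, \mathbf{u}_n\}$ in
$\mathbf{V}', \mathbf{V}$ respectively such that $\mathbf{g}_i(\mathbf{v}_j) = \mathbf{f}_i(\mathbf{u}_j) = \delta_{ij}$, $i, j = 1, \ldots, n$.

(e) Let $\mathbf{U} \subset \mathbf{V}$, $\mathbf{W} \subset \mathbf{V}'$ be two $m$-dimensional subspaces. TFAE

i. $\mathbf{U} \cap \mathbf{W}^\perp = \{0\}$.

ii. $\mathbf{U}^\perp \cap \mathbf{W} = \{0\}$.

iii. There exist bases $\{\mathbf{u}_1, \ldots, \mathbf{u}_m\}$, $\{\mathbf{f}_1, \ldots, \mathbf{f}_m\}$ in $\mathbf{U}, \mathbf{W}$
respectively such that $\mathbf{f}_j(\mathbf{u}_i) = \delta_{ij}$, $i, j = 1, \ldots, m$.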

3. Show Proposition 5.1.2. 

4. Let $\mathbf{U}$ be the space of all polynomials in variable $x$ of degree less than
$m$: $p(x) = \sum_{i=0}^{m-1} a_i x^i$ with coefficients in $\mathbb{F}$. Let $\mathbf{V}$ be the space of
all polynomials in variable $y$ of degree less than $n$: $q(y) = \sum_{j=0}^{n-1} b_j y^j$
with coefficients in $\mathbb{F}$. Then $\mathbf{U} \otimes \mathbf{V}$ is identified with the vector
space of all polynomials in two variables $x, y$ of the form $f(x, y) =
\sum_{i=0}^{m-1} \sum_{j=0}^{n-1} c_{ij} x^i y^j$ with the coefficients in $\mathbb{F}$. The decomposable
elements are $p(x)q(y)$, $p \in \mathbf{U}$, $q \in \mathbf{V}$. (The tensor products of this kind
are basic tools for solving PDE (partial differential equations), using
separation of variables, i.e. Fourier series.)

5. Let $\mathbf{M} = \mathbb{D}^m$, $\mathbf{N} = \mathbb{D}^n$. Show

• $\mathbf{M} \otimes \mathbf{N}$ can be identified with the space of $m \times n$ matrices $\mathbb{D}^{m \times n}$.
More precisely each $A \in \mathbb{D}^{m \times n}$ is viewed as a homomorphism
$A : \mathbb{D}^n \to \mathbb{D}^m$, where $\mathbb{D}^m$ is identified with $\mathbf{M}'$.

• The decomposable tensor $\mathbf{m} \otimes \mathbf{n}$ is identified with $\mathbf{m}\mathbf{n}^T$. (Note
$\mathbf{m}\mathbf{n}^T$ is indeed a rank one matrix.)

6. Prove Proposition 5.1.3. 

7. Show that rank r defined in Definition 5.1.4 is independent of choices 
of bases in M and N. 

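8. Let the assumptions of Proposition 5.1.3 hold. Show that the equalities

$\tau = \sum_{i=1}^{m} \mathbf{d}_i \otimes \Big(\sum_{j=1}^{n} a_{ij} \mathbf{e}_j\Big) = \sum_{j=1}^{n} \Big(\sum_{i=1}^{m} a_{ij} \mathbf{d}_i\Big) \otimes \mathbf{e}_j$

yield (5.1.3).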

9. Prove Proposition 5.1.6. 

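10. Let the assumptions of Proposition 5.1.2 hold. Arrange the basis
of $\mathbf{M} \otimes_{\mathbb{D}} \mathbf{N}$ in the lexicographical order: $\mathbf{d}_1 \otimes \mathbf{e}_1, \ldots, \mathbf{d}_1 \otimes \mathbf{e}_n, \mathbf{d}_2 \otimes
\mathbf{e}_1, \ldots, \mathbf{d}_2 \otimes \mathbf{e}_n, \ldots, \mathbf{d}_m \otimes \mathbf{e}_1, \ldots, \mathbf{d}_m \otimes \mathbf{e}_n$. We denote this basis by
$[\mathbf{d}_1, \ldots, \mathbf{d}_m] \otimes [\mathbf{e}_1, \ldots, \mathbf{e}_n]$.

Let $\mathbf{M}_l, \mathbf{N}_l$ be free modules with the bases $[\mathbf{d}_{1,l}, \ldots, \mathbf{d}_{m_l,l}]$, $[\mathbf{e}_{1,l}, \ldots, \mathbf{e}_{n_l,l}]$
for $l = 1, 2$. Let $T_l : \mathbf{M}_l \to \mathbf{N}_l$ be a homomorphism represented by
$A_l \in \mathbb{D}^{n_l \times m_l}$ in the above bases for $l = 1, 2$. Show that $T_1 \otimes T_2$
is represented by the matrix $A_1 \otimes A_2$ with respect to the bases
$[\mathbf{d}_{1,1}, \ldots, \mathbf{d}_{m_1,1}] \otimes [\mathbf{d}_{1,2}, \ldots, \mathbf{d}_{m_2,2}]$ and $[\mathbf{e}_{1,1}, \ldots, \mathbf{e}_{n_1,1}] \otimes [\mathbf{e}_{1,2}, \ldots, \mathbf{e}_{n_2,2}]$.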

11. Let $A \in \mathbb{D}^{m \times n}$, $B \in \mathbb{D}^{p \times q}$. Show

• If $m = n$ and $A$ is upper triangular then $A \otimes B$ is block upper
triangular.

• If $m = n$, $p = q$ and $A$ and $B$ are upper triangular then $A \otimes B$
is upper triangular.

• If $A$ and $B$ are diagonal matrices then $A \otimes B$ is a diagonal matrix.
In particular $I_m \otimes I_p = I_{mp}$.

• Let $C \in \mathbb{D}^{l \times m}$, $D \in \mathbb{D}^{r \times p}$. Then $(C \otimes D)(A \otimes B) = (CA) \otimes (DB)$.

• If $A \in \mathbf{GL}(m, \mathbb{D})$, $B \in \mathbf{GL}(p, \mathbb{D})$ then $A \otimes B \in \mathbf{GL}(mp, \mathbb{D})$ and
$(A \otimes B)^{-1} = A^{-1} \otimes B^{-1}$.

• $\operatorname{rank} A \otimes B = \operatorname{rank} A \operatorname{rank} B$. (Use the fact that over the quotient
field $\mathbb{F}$ of $\mathbb{D}$, $A$ and $B$ are equivalent to diagonal matrices.)

• Let $m = n$, $p = q$. Show that $\det A \otimes B = (\det A)^p (\det B)^m$.
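
Two of the identities in Problem 11 are easy to test on small integer matrices before attempting a proof. The sketch below checks the mixed product rule $(C \otimes D)(A \otimes B) = (CA) \otimes (DB)$ and the determinant identity in the form $\det(A \otimes B) = (\det A)^p (\det B)^m$ for $A$ of order $m$ and $B$ of order $p$. All function names are hypothetical helpers, and the determinant is computed by the naive permutation expansion, which is adequate only for very small matrices.

```python
from itertools import permutations


def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    p, q = len(B), len(B[0])
    return [
        [A[i // p][j // q] * B[i % p][j % q] for j in range(len(A[0]) * q)]
        for i in range(len(A) * p)
    ]


def matmul(X, Y):
    """Ordinary matrix product X * Y."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]


def det(M):
    """Determinant by the Leibniz (permutation) expansion."""
    n = len(M)
    total = 0
    for perm in permutations(range(n)):
        sign = 1
        for i in range(n):
            for j in range(i + 1, n):
                if perm[i] > perm[j]:
                    sign = -sign
        term = sign
        for i in range(n):
            term *= M[i][perm[i]]
        total += term
    return total
```

A numerical check of this kind is no substitute for the argument via triangular or diagonal forms suggested in the problem; it only helps to catch a wrong exponent or a wrong order of factors.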

12. Let $\mathbf{M}$ be a free module with a basis $[\mathbf{d}_1, \ldots, \mathbf{d}_m]$. Identify $\mathbf{M}^{\otimes 2}$ with
$\mathbb{D}^{m \times m}$. Show that $\operatorname{Sym}^2 \mathbf{M}$ is identified with $\mathrm{S}_m(\mathbb{D}) \subset \mathbb{D}^{m \times m}$, the
module of $m \times m$ symmetric matrices: $A^T = A$, and $\bigwedge^2 \mathbf{M}$ is identified
with $\mathrm{AS}(m, \mathbb{D})$, the module of $m \times m$ skew symmetric matrices: $A^T =
-A$.

Assume that 2 is a unit in $\mathbb{D}$. Show that the decomposition of $\tau \in \mathbf{M}^{\otimes 2}$
as a sum of symmetric and skew symmetric tensors is equivalent to the
following fact: Any matrix $A \in \mathbb{D}^{m \times m}$ is of the form $A = 2^{-1}(A +
A^T) + 2^{-1}(A - A^T)$, which is the unique decomposition to a sum of
symmetric and skew symmetric matrices.

13. • Prove Proposition 5.1.9. 

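• Show that $(\operatorname{Sym}^2 \mathbf{M}, \operatorname{Sym}^2 \mathbf{N})$ and $(\bigwedge^2 \mathbf{M}, \bigwedge^2 \mathbf{N})$ are the only
invariant pairs of submodules of $T^{\otimes 2}$ for all choices of $T \in
\operatorname{Hom}(\mathbf{M}, \mathbf{N})$.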

14. Let $\mathbf{M}$ be a module over the domain $\mathbb{D}$. Let $X \subset \mathbf{M}$ be a subset
of $\mathbf{M}$. Then $\operatorname{span} X$ is the set of all finite linear combinations of the
elements from $X$.

• Show that span X is a submodule of M. 

• span X is called the submodule generated by X. 

15. Let $X$ be a nonempty set. For a given domain $\mathbb{D}$ denote by $\mathbf{M}_{\mathbb{D}}(X)$
the free $\mathbb{D}$-module generated by $X$. That is, $\mathbf{M}_{\mathbb{D}}(X)$ has a set of
elements $\mathbf{e}(x)$, $x \in X$, with the following properties:

• For each finite nonempty subset $Y \subset X$, the set of vectors
$\mathbf{e}(y)$, $y \in Y$, is linearly independent.

• $\mathbf{M}_{\mathbb{D}}(X)$ is generated by $\{\mathbf{e}(x), x \in X\}$.

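16. Let $\mathbf{M}, \mathbf{N}$ be two modules over an integral domain $\mathbb{D}$. Let $\mathbf{P}$ be the
free module generated by $\mathbf{M} \times \mathbf{N} := \{(\mathbf{m}, \mathbf{n}) : \mathbf{m} \in \mathbf{M}, \mathbf{n} \in \mathbf{N}\}$. (See
Problem 15.) Let $\mathbf{Q} \subset \mathbf{P}$ be the submodule generated by the elements of the form

$\mathbf{e}((a\mathbf{m}_1 + b\mathbf{m}_2, c\mathbf{n}_1 + d\mathbf{n}_2)) - ac\,\mathbf{e}((\mathbf{m}_1, \mathbf{n}_1)) - ad\,\mathbf{e}((\mathbf{m}_1, \mathbf{n}_2)) - bc\,\mathbf{e}((\mathbf{m}_2, \mathbf{n}_1)) - bd\,\mathbf{e}((\mathbf{m}_2, \mathbf{n}_2))$

for all $a, b, c, d \in \mathbb{D}$ and $\mathbf{m}_1, \mathbf{m}_2 \in \mathbf{M}$, $\mathbf{n}_1, \mathbf{n}_2 \in \mathbf{N}$. Then $\mathbf{M} \otimes_{\mathbb{D}} \mathbf{N} :=
\mathbf{P}/\mathbf{Q}$ is called the tensor product of $\mathbf{M}$ and $\mathbf{N}$ over $\mathbb{D}$.

Show that if $\mathbf{M}, \mathbf{N}$ are two free finite dimensional modules then the
above definition of $\mathbf{M} \otimes_{\mathbb{D}} \mathbf{N}$ is isomorphic to Definition 5.1.1.

5.2 Tensor product of several free modules

Definition 5.2.1 Let $\mathbf{M}_i$ be free finite dimensional modules over a domain
$\mathbb{D}$ for $i = 1, \ldots, k$, where $k \ge 2$. Then $\mathbf{M} := \otimes_{i=1}^{k} \mathbf{M}_i = \mathbf{M}_1 \otimes \mathbf{M}_2 \otimes
\ldots \otimes \mathbf{M}_k$, the tensor product space of $\mathbf{M}_1, \ldots, \mathbf{M}_k$, is defined as follows.
For $k = 2$, $\mathbf{M}_1 \otimes \mathbf{M}_2$ is defined in Definition 5.1.1. For $k \ge 3$, $\otimes_{i=1}^{k} \mathbf{M}_i$ is
defined recursively as $(\otimes_{i=1}^{k-1} \mathbf{M}_i) \otimes \mathbf{M}_k$.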

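Note that from now on we suppress in our notation the dependence on
$\mathbb{D}$. When we need to emphasize $\mathbb{D}$ we use the notation $\mathbf{M}_1 \otimes_{\mathbb{D}} \ldots \otimes_{\mathbb{D}} \mathbf{M}_k$.
$\mathbf{M}$ is spanned by the decomposable tensors

$\otimes_{i=1}^{k} \mathbf{m}_i := \mathbf{m}_1 \otimes \mathbf{m}_2 \otimes \ldots \otimes \mathbf{m}_k, \quad \mathbf{m}_i \in \mathbf{M}_i, \; i = 1, \ldots, k,$

called also rank one tensors. One has the basic identity:

$a(\mathbf{m}_1 \otimes \mathbf{m}_2 \otimes \ldots \otimes \mathbf{m}_k) = (a\mathbf{m}_1) \otimes \mathbf{m}_2 \otimes \ldots \otimes \mathbf{m}_k =
\mathbf{m}_1 \otimes (a\mathbf{m}_2) \otimes \ldots \otimes \mathbf{m}_k = \ldots = \mathbf{m}_1 \otimes \mathbf{m}_2 \otimes \ldots \otimes (a\mathbf{m}_k).$

Furthermore, the above decomposable tensor is multilinear in each variable.
Clearly

(5.2.1) $\otimes_{i=1}^{k} \mathbf{m}_{j_i,i}$, $j_i = 1, \ldots, m_i$, $i = 1, \ldots, k$, is a basis of $\otimes_{i=1}^{k} \mathbf{M}_i$

if $\mathbf{m}_{1,i}, \ldots, \mathbf{m}_{m_i,i}$ is a basis of $\mathbf{M}_i$ for $i = 1, \ldots, k$.

Hence

(5.2.2) $\dim \otimes_{i=1}^{k} \mathbf{M}_i = \prod_{i=1}^{k} \dim \mathbf{M}_i.$

Thus

(5.2.3) $\alpha = \sum_{j_1 = \ldots = j_k = 1}^{m_1, m_2, \ldots, m_k} a_{j_1 j_2 \ldots j_k} \otimes_{i=1}^{k} \mathbf{m}_{j_i,i}$, for any $\alpha \in \otimes_{i=1}^{k} \mathbf{M}_i$.

Denote

(5.2.4) $\mathbb{D}^{m_1 \times \ldots \times m_k} := \otimes_{i=1}^{k} \mathbb{D}^{m_i}$, for $k \in \mathbb{N}$ and $m_i \in \mathbb{N}$, $i = 1, \ldots, k$.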

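$\mathcal{A} \in \mathbb{D}^{m_1 \times \ldots \times m_k}$ is given as $\mathcal{A} := [a_{j_1 \ldots j_k}]_{j_1 = \ldots = j_k = 1}^{m_1, \ldots, m_k}$, where $a_{j_1 \ldots j_k} \in
\mathbb{D}$, $j_i = 1, \ldots, m_i$, $i = 1, \ldots, k$. $\mathcal{A}$ is called a $k$-tensor. So a 1-tensor is a
vector and a 2-tensor is a matrix.

In particular $\otimes_{i=1}^{k} \mathbf{M}_i$ is isomorphic to $\mathbb{D}^{m_1 \times \ldots \times m_k}$. Furthermore, after
choosing a basis of $\otimes_{i=1}^{k} \mathbf{M}_i$ of the form (5.2.1) we correspond to each
$\tau \in \otimes_{i=1}^{k} \mathbf{M}_i$ of the form (5.2.3) the tensor $\mathcal{A} = [a_{j_1 \ldots j_k}]_{j_1 = \ldots = j_k = 1}^{m_1, \ldots, m_k} \in
\mathbb{D}^{m_1 \times \ldots \times m_k}$.

Proposition 5.2.2 Let $\mathbf{M}_i, \mathbf{N}_i$, $i = 1, \ldots, k$, be free finite dimensional
modules over $\mathbb{D}$. Let $T_i : \mathbf{M}_i \to \mathbf{N}_i$, $i = 1, \ldots, k$, be homomorphisms. Then
there exists a unique homomorphism $T : \otimes_{i=1}^{k} \mathbf{M}_i \to \otimes_{i=1}^{k} \mathbf{N}_i$ such that
$T(\otimes_{i=1}^{k} \mathbf{m}_i) = \otimes_{i=1}^{k} (T_i \mathbf{m}_i)$ for all $\mathbf{m}_i \in \mathbf{M}_i$, $i = 1, \ldots, k$. This
homomorphism is denoted by $\otimes_{i=1}^{k} T_i$.

Suppose furthermore that $\mathbf{W}_i$, $i = 1, \ldots, k$, are free finite dimensional
$\mathbb{D}$-modules, and $P_i : \mathbf{N}_i \to \mathbf{W}_i$, $i = 1, \ldots, k$, are homomorphisms. Then
$(\otimes_{i=1}^{k} P_i)(\otimes_{i=1}^{k} T_i) = \otimes_{i=1}^{k} (P_i T_i)$.

See Problem 2.

Since each homomorphism $T_i : \mathbf{M}_i \to \mathbf{N}_i$, $i = 1, \ldots, k$, is represented by
a matrix, one can reduce the definition of $\otimes_{i=1}^{k} T_i$ to the notion of tensor
product of $k$ matrices.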

Definition 5.2.3 Let $A_i = [a_{lj,i}]_{l,j=1}^{m_i,n_i} \in \mathbb{D}^{m_i \times n_i}$, $i = 1, \ldots, k$. Then
the Kronecker product $A := \otimes_{i=1}^{k} A_i \in \mathbb{D}^{m_1 \ldots m_k \times n_1 \ldots n_k}$ is the matrix with
the entries

$A = [a_{(l_1, \ldots, l_k)(j_1, \ldots, j_k)}], \quad a_{(l_1, \ldots, l_k)(j_1, \ldots, j_k)} := \prod_{i=1}^{k} a_{l_i j_i, i},$

for $l_i = 1, \ldots, m_i$, $j_i = 1, \ldots, n_i$, $i = 1, \ldots, k$,

where the indices $(l_1, \ldots, l_k)$, $l_i = 1, \ldots, m_i$, $i = 1, \ldots, k$, and the indices
$(j_1, \ldots, j_k)$, $j_i = 1, \ldots, n_i$, $i = 1, \ldots, k$, are arranged in the lexicographical
order.

It is straightforward to show that the above tensor product of matrices
can be recursively defined by the Kronecker product of two matrices as
defined in Definition 5.1.7. See Problem 3. The tensor products of $k$ matrices
have similar properties as in the case $k = 2$. See Problem 4.
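
The agreement between the entrywise definition and the recursive one can be illustrated on small matrices. In the sketch below, `kron_entrywise` follows the product formula of Definition 5.2.3 with multi-indices in lexicographical order, while `kron_recursive` folds the two-matrix Kronecker product from the left. Both names are hypothetical, and the code is an illustration rather than a solution of Problem 3.

```python
from functools import reduce
from itertools import product


def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    p, q = len(B), len(B[0])
    return [
        [A[i // p][j // q] * B[i % p][j % q] for j in range(len(A[0]) * q)]
        for i in range(len(A) * p)
    ]


def kron_entrywise(mats):
    """k-fold Kronecker product built from the entry formula.

    itertools.product enumerates index tuples in lexicographical order,
    with the last index varying fastest.
    """
    row_ranges = [range(len(M)) for M in mats]
    col_ranges = [range(len(M[0])) for M in mats]
    result = []
    for ls in product(*row_ranges):
        row = []
        for js in product(*col_ranges):
            entry = 1
            for M, l, j in zip(mats, ls, js):
                entry *= M[l][j]
            row.append(entry)
        result.append(row)
    return result


def kron_recursive(mats):
    """k-fold Kronecker product as ((A_1 x A_2) x A_3) x ... ."""
    return reduce(kron, mats)
```

The two constructions agree because the row index of $(A_1 \otimes A_2) \otimes A_3$ is $(l_1 m_2 + l_2) m_3 + l_3$ in zero-based indexing, which is exactly the lexicographical position of $(l_1, l_2, l_3)$.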

We now consider the $k$-symmetric and $k$-exterior products of a free finite
dimensional module $\mathbf{M}$. In view of the previous section we may assume that
$k \ge 3$. Denote by $S_k$ the permutation group of $k$ elements of $\{1, \ldots, k\}$,
i.e. $S_k$ is the group of injections $\sigma : \{1, \ldots, k\} \to \{1, \ldots, k\}$. Recall that
$\operatorname{sgn}(\sigma) \in \{1, -1\}$ is the sign of the permutation. That is, $\operatorname{sgn}(\sigma) = 1$ if $\sigma$ is
an even permutation, and $\operatorname{sgn}(\sigma) = -1$ if $\sigma$ is an odd permutation.

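Definition 5.2.4 Let $\mathbf{M}$ be a free finite dimensional module over $\mathbb{D}$ and
$2 \le k \in \mathbb{N}$. Denote $\mathbf{M}^{\otimes k} := \otimes_{i=1}^{k} \mathbf{M}_i$, where $\mathbf{M}_i = \mathbf{M}$ for $i = 1, \ldots, k$. The
submodule $\operatorname{Sym}^k \mathbf{M} \subset \mathbf{M}^{\otimes k}$, called the $k$-symmetric power of $\mathbf{M}$, is spanned
by tensors of the form

(5.2.5) $\operatorname{sym}_k(\mathbf{m}_1, \ldots, \mathbf{m}_k) := \sum_{\sigma \in S_k} \otimes_{i=1}^{k} \mathbf{m}_{\sigma(i)}$

for all $\mathbf{m}_i \in \mathbf{M}$, $i = 1, \ldots, k$. $\operatorname{sym}_k(\mathbf{m}_1, \ldots, \mathbf{m}_k)$ is called a $k$-symmetric
product of $\mathbf{m}_1, \ldots, \mathbf{m}_k$, or simply a symmetric product. Any tensor $\tau \in
\operatorname{Sym}^k \mathbf{M}$ is called a $k$-symmetric tensor, or simply a symmetric tensor.
The subspace $\bigwedge^k \mathbf{M} \subset \mathbf{M}^{\otimes k}$, called the $k$-exterior power of $\mathbf{M}$, is spanned by
all tensors of the form

(5.2.6) $\wedge_{i=1}^{k} \mathbf{m}_i = \mathbf{m}_1 \wedge \ldots \wedge \mathbf{m}_k := \sum_{\sigma \in S_k} \operatorname{sgn}(\sigma) \otimes_{i=1}^{k} \mathbf{m}_{\sigma(i)}$

for all $\mathbf{m}_i \in \mathbf{M}$, $i = 1, \ldots, k$. $\wedge_{i=1}^{k} \mathbf{m}_i$ is called the $k$-wedge product of $\mathbf{m}_1, \ldots, \mathbf{m}_k$.
Any vector $\tau \in \bigwedge^k \mathbf{M}$ is called a $k$-skew symmetric tensor, or simply a
skew symmetric tensor.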

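Proposition 5.2.5 Let $\mathbf{M}, \mathbf{N}$ be free finite dimensional modules over $\mathbb{D}$.
Let $T \in \operatorname{Hom}(\mathbf{M}, \mathbf{N})$. For $k \in \mathbb{N}$ let $T^{\otimes k} : \mathbf{M}^{\otimes k} \to \mathbf{N}^{\otimes k}$ be $T \otimes \ldots \otimes T$
($k$ factors). Then

$T^{\otimes k} : \operatorname{Sym}^k \mathbf{M} \to \operatorname{Sym}^k \mathbf{N}, \quad T^{\otimes k} : \bigwedge^k \mathbf{M} \to \bigwedge^k \mathbf{N}.$

See Problem 5.

Definition 5.2.6 Let $\mathbf{M}, \mathbf{N}$ be free finite dimensional modules over $\mathbb{D}$.
Let $T \in \operatorname{Hom}(\mathbf{M}, \mathbf{N})$. Then $\wedge^k T \in \operatorname{Hom}(\bigwedge^k \mathbf{M}, \bigwedge^k \mathbf{N})$ is defined as the
restriction of $T^{\otimes k}$ to $\bigwedge^k \mathbf{M}$.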

Proposition 5.2.7 Let $\mathbf{M}, \mathbf{N}$ be free finite dimensional modules over
$\mathbb{D}$. Let $T \in \operatorname{Hom}(\mathbf{M}, \mathbf{N})$. Then

1. Let $[\mathbf{d}_1, \ldots, \mathbf{d}_m]$, $[\mathbf{e}_1, \ldots, \mathbf{e}_n]$ be bases in $\mathbf{M}, \mathbf{N}$ respectively. Assume
that $T$ is represented by the matrix $A = [a_{ij}] \in \mathbb{D}^{n \times m}$ in these bases.
Then $\wedge^k T$ is represented in the bases

$\wedge_{i=1}^{k} \mathbf{d}_{j_i}, \; 1 \le j_1 < \ldots < j_k \le m, \quad \wedge_{i=1}^{k} \mathbf{e}_{l_i}, \; 1 \le l_1 < \ldots < l_k \le n,$

by the matrix $\wedge^k A \in \mathbb{D}^{\binom{n}{k} \times \binom{m}{k}}$, where the entry $((l_1, \ldots, l_k), (j_1, \ldots, j_k))$
of $\wedge^k A$ is the $k \times k$ minor of $A$ based on the $(l_1, \ldots, l_k)$ rows and
$(j_1, \ldots, j_k)$ columns of $A$.

2. Let $\mathbf{L}$ be a free finite dimensional module and assume that $S \in \operatorname{Hom}(\mathbf{L}, \mathbf{M})$.
Then $\wedge^k (TS) = (\wedge^k T)(\wedge^k S)$.

See Problem 5.

Remark 5.2.8 In the classical matrix books as [Gan59] and [MaM64]
the matrix $\wedge^k A$ is called the $k$th compound matrix or $k$th adjugate of $A$.
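
The description of $\wedge^k A$ by minors, and the multiplicativity $\wedge^k(TS) = (\wedge^k T)(\wedge^k S)$, which in matrix terms is the Cauchy-Binet formula, can be illustrated as follows. This is a sketch with hypothetical function names; row and column index sets are enumerated in lexicographical order via `itertools.combinations`, and the minors are computed by the naive permutation expansion.

```python
from itertools import combinations, permutations


def det(M):
    """Determinant by the Leibniz (permutation) expansion."""
    n = len(M)
    total = 0
    for perm in permutations(range(n)):
        sign = 1
        for i in range(n):
            for j in range(i + 1, n):
                if perm[i] > perm[j]:
                    sign = -sign
        term = sign
        for i in range(n):
            term *= M[i][perm[i]]
        total += term
    return total


def compound(A, k):
    """k-th compound matrix: all k x k minors of A, index sets in lexicographical order."""
    n_rows, n_cols = len(A), len(A[0])
    return [
        [det([[A[l][j] for j in cols] for l in rows])
         for cols in combinations(range(n_cols), k)]
        for rows in combinations(range(n_rows), k)
    ]


def matmul(X, Y):
    """Ordinary matrix product X * Y."""
    return [
        [sum(X[i][t] * Y[t][j] for t in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]
```

For a square matrix of order $n$ the case $k = n$ gives the $1 \times 1$ matrix $[\det A]$, so the multiplicativity of compounds contains $\det(TS) = \det T \det S$ as a special case.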

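Proposition 5.2.9 Let $\mathbf{M}_1, \ldots, \mathbf{M}_k$, $\mathbf{M} := \otimes_{i=1}^{k} \mathbf{M}_i$ be free finite
dimensional modules over $\mathbb{D}$ with bases given in (5.2.1). Let $[\mathbf{n}_{1,i}, \ldots, \mathbf{n}_{m_i,i}] =
[\mathbf{m}_{1,i}, \ldots, \mathbf{m}_{m_i,i}]T_i^{-1}$, $T_i = [t_{lj,i}] \in \mathbf{GL}(m_i, \mathbb{D})$, be another basis of $\mathbf{M}_i$ for
$i = 1, \ldots, k$. Let $\alpha \in \mathbf{M}$ be given by (5.2.3). Then

(5.2.7) $\alpha = \sum_{l_1 = \ldots = l_k = 1}^{m_1, \ldots, m_k} b_{l_1 \ldots l_k} \otimes_{i=1}^{k} \mathbf{n}_{l_i,i}$, where

$b_{l_1, \ldots, l_k} = \sum_{j_1, \ldots, j_k = 1}^{m_1, \ldots, m_k} \Big(\prod_{i=1}^{k} t_{l_i j_i, i}\Big) a_{j_1 \ldots j_k}$ for $l_i = 1, \ldots, m_i$, $i = 1, \ldots, k$.

That is, if $\mathcal{A} := [a_{j_1 \ldots j_k}]$, $\mathcal{B} := [b_{l_1 \ldots l_k}]$ then $\mathcal{B} = (\otimes_{i=1}^{k} T_i)\mathcal{A}$.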

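Definition 5.2.10 Let $\mathbf{M}_1, \ldots, \mathbf{M}_k$ be free finite dimensional modules
over a domain $\mathbb{D}$. Let $\tau \in \otimes_{i=1}^{k} \mathbf{M}_i$. The tensor rank of $\tau$, denoted by
$\operatorname{Rank} \tau$, is the minimal $R$ such that $\tau = \sum_{l=1}^{R} \otimes_{i=1}^{k} \mathbf{m}_{l,i}$ for some $\mathbf{m}_{l,i} \in
\mathbf{M}_i$, $l = 1, \ldots, R$, $i = 1, \ldots, k$.

We shall see that for $k \ge 3$ it is hard to determine the tensor rank of a
general $k$-tensor even in the case $\mathbb{D} = \mathbb{C}$.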

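Let $\mathbf{M}$ be a $\mathbb{D}$-module, and let $\mathbf{M}' = \operatorname{Hom}(\mathbf{M}, \mathbb{D})$ be the dual module of
$\mathbf{M}$. For $\mathbf{m} \in \mathbf{M}$, $\mathbf{g} \in \mathbf{M}'$ we denote $\langle \mathbf{m}, \mathbf{g} \rangle := \mathbf{g}(\mathbf{m})$. Let

$\mathbf{m}_1, \ldots, \mathbf{m}_k \in \mathbf{M}, \quad \mathbf{g}_1, \ldots, \mathbf{g}_k \in \mathbf{M}'.$

It is straightforward to show

(5.2.8) $\langle \mathbf{m}_1 \wedge \ldots \wedge \mathbf{m}_k, \mathbf{g}_1 \wedge \ldots \wedge \mathbf{g}_k \rangle = k! \langle \otimes_{i=1}^{k} \mathbf{m}_i, \mathbf{g}_1 \wedge \ldots \wedge \mathbf{g}_k \rangle = k! \det \begin{bmatrix} \langle \mathbf{m}_1, \mathbf{g}_1 \rangle & \cdots & \langle \mathbf{m}_1, \mathbf{g}_k \rangle \\ \vdots & & \vdots \\ \langle \mathbf{m}_k, \mathbf{g}_1 \rangle & \cdots & \langle \mathbf{m}_k, \mathbf{g}_k \rangle \end{bmatrix}.$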

See Problem 8b. 
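
The identity (5.2.8) can be checked in coordinates. The sketch below takes $\mathbf{M} = \mathbb{Z}^n$ with the standard basis and identifies $\mathbf{M}'$ with $\mathbb{Z}^n$ through the dual basis, so that $\langle \mathbf{m}, \mathbf{g} \rangle$ is the usual dot product and the pairing of two $k$-tensors is the sum of products of their coordinates, in line with the product pairing on decomposable tensors. The dictionary representation of tensors and all function names are choices made for the illustration only.

```python
from itertools import permutations, product
from math import factorial


def sign(perm):
    """Sign of a permutation given as a tuple of 0-based images."""
    s = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                s = -s
    return s


def tensor(vectors):
    """Coordinates of the decomposable tensor v_1 (x) ... (x) v_k."""
    k, n = len(vectors), len(vectors[0])
    coords = {}
    for J in product(range(n), repeat=k):
        entry = 1
        for i in range(k):
            entry *= vectors[i][J[i]]
        coords[J] = entry
    return coords


def wedge(vectors):
    """Coordinates of v_1 ^ ... ^ v_k as defined in (5.2.6)."""
    k, n = len(vectors), len(vectors[0])
    coords = {}
    for J in product(range(n), repeat=k):
        total = 0
        for perm in permutations(range(k)):
            term = sign(perm)
            for i in range(k):
                term *= vectors[perm[i]][J[i]]
            total += term
        coords[J] = total
    return coords


def pair(s, t):
    """Pairing of a k-tensor with a dual k-tensor, both in coordinates."""
    return sum(s[J] * t[J] for J in s)


def gram_det(ms, gs):
    """Determinant of the matrix [<m_i, g_j>] by permutation expansion."""
    k = len(ms)
    G = [[sum(a * b for a, b in zip(m, g)) for g in gs] for m in ms]
    total = 0
    for perm in permutations(range(k)):
        term = sign(perm)
        for i in range(k):
            term *= G[i][perm[i]]
        total += term
    return total
```

With $\mathbf{m}_1 = (1, 2, 0)$, $\mathbf{m}_2 = (0, 1, 3)$, $\mathbf{g}_1 = (2, 0, 1)$, $\mathbf{g}_2 = (1, 1, 1)$ the matrix of pairings has rows $(2, 3)$ and $(3, 4)$, so its determinant is $-1$, and the three quantities in (5.2.8) are $-2$, $2 \cdot (-1)$ and $2 \cdot (-1)$.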

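Assume that $\mathbf{M}$ is an $m$-dimensional free module over $\mathbb{D}$, with the basis
$\mathbf{d}_1, \ldots, \mathbf{d}_m$. Recall that $\mathbf{M}'$ is an $m$-dimensional free module with the dual
basis $\mathbf{f}_1, \ldots, \mathbf{f}_m$:

(5.2.9) $\langle \mathbf{d}_i, \mathbf{f}_j \rangle = \mathbf{f}_j(\mathbf{d}_i) = \delta_{ij}, \quad i, j = 1, \ldots, m.$

Let $\mathbf{M}_1, \ldots, \mathbf{M}_k$, $\mathbf{M} := \otimes_{i=1}^{k} \mathbf{M}_i$ be free finite dimensional modules over
$\mathbb{D}$ with bases given in (5.2.1). Let $\mathbf{f}_{1,i}, \ldots, \mathbf{f}_{m_i,i}$ be the dual basis of $\mathbf{M}_i'$ for
$i = 1, \ldots, k$. Then $\mathbf{M}'$ is isomorphic to $\otimes_{i=1}^{k} \mathbf{M}_i'$, where we assume that

(5.2.10) $\langle \otimes_{i=1}^{k} \mathbf{m}_i, \otimes_{i=1}^{k} \mathbf{g}_i \rangle := \prod_{i=1}^{k} \langle \mathbf{m}_i, \mathbf{g}_i \rangle, \quad \mathbf{m}_i \in \mathbf{M}_i, \; \mathbf{g}_i \in \mathbf{M}_i', \; i = 1, \ldots, k.$

In particular, $\mathbf{M}'$ has the dual basis $\otimes_{i=1}^{k} \mathbf{f}_{j_i,i}$, $j_i = 1, \ldots, m_i$, $i = 1, \ldots, k$.

Assume that $\mathbf{d}_1, \ldots, \mathbf{d}_m$ is a basis of $\mathbf{M}$ and $\mathbf{f}_1, \ldots, \mathbf{f}_m$ is the dual basis
of $\mathbf{M}'$. Note that $\bigwedge^k \mathbf{M}'$ is a submodule of $(\bigwedge^k \mathbf{M})'$. See Problem 8c. Note
that if $\mathbb{Q} \subset \mathbb{D}$ then $\bigwedge^k \mathbf{M}' = (\bigwedge^k \mathbf{M})'$.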

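Let $\mathbf{N}$ be a module over $\mathbb{D}$ of dimension $n$, as defined in Problem 1.6.1.
Assume that $\mathbf{M} \subset \mathbf{N}$ is a submodule of dimension $m \le n$. For any $k \in \mathbb{N}$
we view $\bigwedge^k \mathbf{M}$ as a submodule of $\bigwedge^k \mathbf{N}$. $\bigwedge^0 \mathbf{M} := 1$, $\bigwedge^m \mathbf{M}$ is a one
dimensional module, while for $k > m$ it is agreed that $\bigwedge^k \mathbf{M}$ is a trivial
subspace consisting of the zero vector. (See Problem 10.)

Let $\mathbf{O} \subset \mathbf{N}$ be another submodule of $\mathbf{N}$. Then $(\bigwedge^p \mathbf{M}) \wedge (\bigwedge^q \mathbf{O})$ is a
submodule of $\bigwedge^{p+q}(\mathbf{M} + \mathbf{O})$ of $\bigwedge^{p+q} \mathbf{N}$, spanned by $(\mathbf{m}_1 \wedge \ldots \wedge \mathbf{m}_p) \wedge (\mathbf{o}_1 \wedge
\ldots \wedge \mathbf{o}_q)$, where $\mathbf{m}_1, \ldots, \mathbf{m}_p \in \mathbf{M}$, $\mathbf{o}_1, \ldots, \mathbf{o}_q \in \mathbf{O}$ for $p, q \ge 1$. If $p = 0$ or
$q = 0$ then $(\bigwedge^p \mathbf{M}) \wedge (\bigwedge^q \mathbf{O})$ is equal to $\bigwedge^q \mathbf{O}$ or $\bigwedge^p \mathbf{M}$ respectively.

In the next sections we need the following lemma.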

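Lemma 5.2.11 Let $V$ be an $n$-dimensional vector space over $\mathbb{F}$. Assume that $0 \le p_1, p_2$, $1 \le q_1, q_2$, $k := p_1 + q_1 = p_2 + q_2 \le n$. Suppose that $U_1, U_2, W_1, W_2$ are subspaces of $V$ such that $\dim U_i = p_i$, $\dim W_i \ge q_i$ for $i = 1, 2$ and $U_1 \cap W_1 = U_2 \cap W_2 = \{0\}$. Then

(5.2.11)
$$\Big( \big(\textstyle\bigwedge^{p_1} U_1\big) \wedge \big(\textstyle\bigwedge^{q_1} W_1\big) \Big) \cap \Big( \big(\textstyle\bigwedge^{p_2} U_2\big) \wedge \big(\textstyle\bigwedge^{q_2} W_2\big) \Big) \ne \{0\}$$

if and only if the following condition holds. There exists a subspace $V_1 \subseteq V$ of dimension at least $k$ such that

(5.2.12)
$$U_1 \subseteq V_1, \quad U_2 \subseteq V_1, \quad V_1 \subseteq (U_1 + W_1), \quad V_1 \subseteq (U_2 + W_2).$$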
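Proof. Assume first that (5.2.11) holds. Note that

$$\big(\textstyle\bigwedge^{p_i} U_i\big) \wedge \big(\textstyle\bigwedge^{q_i} W_i\big) \subseteq \textstyle\bigwedge^k (U_i + W_i), \quad i = 1, 2.$$

Let $V_2 := (U_1 + W_1) \cap (U_2 + W_2)$. Problem 10a yields that

(5.2.13)
$$\Big( \big(\textstyle\bigwedge^{p_1} U_1\big) \wedge \big(\textstyle\bigwedge^{q_1} W_1\big) \Big) \cap \Big( \big(\textstyle\bigwedge^{p_2} U_2\big) \wedge \big(\textstyle\bigwedge^{q_2} W_2\big) \Big) \subseteq \textstyle\bigwedge^k V_2.$$

The assumption (5.2.11) implies that $\dim V_2 \ge k$. We now show that $U_1 \subseteq V_2$. Assume to the contrary that $\dim U_1 \cap V_2 = i < p_1$. Choose a basis $\mathbf{v}_1, \ldots, \mathbf{v}_n$ in $V$ such that $\mathbf{v}_1, \ldots, \mathbf{v}_{p_1}$ and $\mathbf{v}_1, \ldots, \mathbf{v}_i, \mathbf{v}_{p_1+1}, \ldots, \mathbf{v}_r$ are bases of $U_1$ and $V_2$ respectively. Observe that the span of the vectors $\mathbf{v}_1 \wedge \cdots \wedge \mathbf{v}_{p_1} \wedge \mathbf{v}_{i_1} \wedge \cdots \wedge \mathbf{v}_{i_{q_1}}$ for $p_1 < i_1 < \cdots < i_{q_1} \le n$ contains the subspace $(\bigwedge^{p_1} U_1) \wedge (\bigwedge^{q_1} W_1)$. On the other hand the subspace $\bigwedge^k V_2$ has a basis formed by the exterior products of $k$ vectors out of $\mathbf{v}_1, \ldots, \mathbf{v}_i, \mathbf{v}_{p_1+1}, \ldots, \mathbf{v}_r$. Hence $\big( (\bigwedge^{p_1} U_1) \wedge (\bigwedge^{q_1} W_1) \big) \cap \bigwedge^k V_2 = \{0\}$, which contradicts (5.2.11)-(5.2.13). So $U_1 \subseteq V_2$. Similarly $U_2 \subseteq V_2$.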

Next we claim that $\dim (U_1 + U_2) \le k$. Assume to the contrary that $\dim (U_1 + U_2) = j > k$. Let $\mathbf{u}_1, \ldots, \mathbf{u}_n$ be a basis of $V$ such that

$$\mathbf{u}_1, \ldots, \mathbf{u}_{p_1} \quad \text{and} \quad \mathbf{u}_1, \ldots, \mathbf{u}_{p_1+p_2-j}, \mathbf{u}_{p_1+1}, \ldots, \mathbf{u}_j$$

are bases of $U_1$ and $U_2$ respectively. Then $(\bigwedge^{p_1} U_1) \wedge (\bigwedge^{q_1} W_1)$ is spanned by $\binom{n-p_1}{q_1}$ linearly independent vectors $\mathbf{u}_{i_1} \wedge \cdots \wedge \mathbf{u}_{i_k}$, where $1 \le i_1 < \cdots < i_k \le n$ and $\{1, \ldots, p_1\} \subseteq \{i_1, \ldots, i_k\}$. Similarly, $(\bigwedge^{p_2} U_2) \wedge (\bigwedge^{q_2} W_2)$ is spanned by $\binom{n-p_2}{q_2}$ linearly independent vectors $\mathbf{u}_{j_1} \wedge \cdots \wedge \mathbf{u}_{j_k}$, where $1 \le j_1 < \cdots < j_k \le n$ and $\{1, \ldots, p_1+p_2-j, p_1+1, \ldots, j\} \subseteq \{j_1, \ldots, j_k\}$. Since $j > k$ it follows that these two subsets of the full basis of $\bigwedge^k V$ do not have any common vector, which contradicts (5.2.11). So $\dim (U_1 + U_2) \le k$. Choose any $k$-dimensional subspace of $V_2$ which contains $U_1 + U_2$.

Vice versa, suppose that $V_1$ is a $k$-dimensional subspace of $V$ satisfying (5.2.12). So $\bigwedge^k V_1$ is a one dimensional subspace which is contained in $(\bigwedge^{p_i} U_i) \wedge (\bigwedge^{q_i} W_i)$ for $i = 1, 2$. Hence (5.2.11) holds.

□ 



Problems 

1. Let $M_1, \ldots, M_k$ be free finite dimensional modules over $\mathbb{D}$. Show that for any $\sigma \in S_k$, $\bigotimes_{i=1}^k M_{\sigma(i)}$ is isomorphic to $\bigotimes_{i=1}^k M_i$.

2. Prove Proposition 5.2.2. 

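3. Show

• Let $A \in \mathbb{D}^{m \times n}$ and $B \in \mathbb{D}^{p \times q}$. Then the definitions of $A \otimes B$ given by Definitions 5.1.7 and 5.2.3 coincide.

• Let the assumptions of Definition 5.2.3 hold. Assume that $k \ge 3$. Then the recursive definition $\bigotimes_{i=1}^k A_i := (\bigotimes_{i=1}^{k-1} A_i) \otimes A_k$ coincides with the definition of $\bigotimes_{i=1}^k A_i$ given in Definition 5.2.3.

4. Let $A_i \in \mathbb{D}^{m_i \times n_i}$, $i = 1, \ldots, k \ge 3$. Show

• $\bigotimes_{i=1}^k (a_i A_i) = (\prod_{i=1}^k a_i) \bigotimes_{i=1}^k A_i$.

• $(\bigotimes_{i=1}^k A_i)^\top = \bigotimes_{i=1}^k A_i^\top$.

• If $m_i = n_i$ and $A_i$ is upper triangular for $i = 1, \ldots, k$ then $\bigotimes_{i=1}^k A_i$ is upper triangular.

• If $A_1, \ldots, A_k$ are diagonal matrices then $\bigotimes_{i=1}^k A_i$ is a diagonal matrix. In particular $\bigotimes_{i=1}^k I_{m_i} = I_{m_1 \cdots m_k}$.

• Let $B_i \in \mathbb{D}^{l_i \times m_i}$, $i = 1, \ldots, k$. Then $(\bigotimes_{i=1}^k B_i)(\bigotimes_{i=1}^k A_i) = \bigotimes_{i=1}^k (B_i A_i)$.

• If $A_i \in \mathrm{GL}(m_i, \mathbb{D})$, $i = 1, \ldots, k$ then $\bigotimes_{i=1}^k A_i \in \mathrm{GL}(m_1 \cdots m_k, \mathbb{D})$ and $(\bigotimes_{i=1}^k A_i)^{-1} = \bigotimes_{i=1}^k A_i^{-1}$.

• $\mathrm{rank}\, \bigotimes_{i=1}^k A_i = \prod_{i=1}^k \mathrm{rank}\, A_i$.

• For $m_i = n_i$, $i = 1, \ldots, k$, $\det \bigotimes_{i=1}^k A_i = \prod_{i=1}^k (\det A_i)^{(\prod_{j=1}^k m_j)/m_i}$.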
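Two of the identities in Problem 4 can be spot-checked on small integer matrices. The following is a minimal sketch in plain Python, assuming the usual block convention for the Kronecker product (block $(i,j)$ of $A \otimes B$ equals $a_{ij} B$); the helper names `kron`, `matmul`, `det` and `predicted_det` are illustrative and not taken from the text.

```python
from functools import reduce
from itertools import permutations


def kron(a, b):
    """Kronecker product of matrices given as lists of rows: block (i, j) is a[i][j] * b."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def matmul(a, b):
    """Ordinary matrix product of lists of rows."""
    return [
        [sum(a[i][t] * b[t][j] for t in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def det(a):
    """Determinant by the Leibniz expansion (fine for matrices up to about 6 x 6)."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
        term = -1 if inversions % 2 else 1
        for i in range(n):
            term *= a[i][p[i]]
        total += term
    return total


def predicted_det(mats):
    """Right-hand side of the determinant identity: prod_i det(A_i) ** (N / m_i)."""
    total_size = reduce(lambda x, y: x * y, (len(m) for m in mats), 1)
    result = 1
    for m in mats:
        result *= det(m) ** (total_size // len(m))
    return result


A1 = [[1, 2], [3, 4]]
A2 = [[2, 0, 1], [1, 1, 0], [0, 3, 1]]
B1 = [[1, 0], [1, 1]]
B2 = [[1, 2, 0]]

# Determinant of the Kronecker product versus the predicted product of powers.
print(det(kron(A1, A2)), predicted_det([A1, A2]))

# Mixed-product rule: (B1 (x) B2)(A1 (x) A2) == (B1 A1) (x) (B2 A2).
print(matmul(kron(B1, B2), kron(A1, A2)) == kron(matmul(B1, A1), matmul(B2, A2)))
```

The check uses $k = 2$ factors only to keep the matrices small; the recursive definition in Problem 3 extends the same computation to more factors.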

5. Prove Proposition 5.2.7. 

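6. (a) Let $A \in \mathbb{D}^{m \times n}$, $B \in \mathbb{D}^{n \times p}$. Show that $\bigwedge^k AB = \bigwedge^k A \, \bigwedge^k B$ for any $k \in [1, \min(m, n, p)] \cap \mathbb{N}$.

(b) Let $A \in \mathbb{D}^{n \times n}$. Then $\bigwedge^k A$ is upper triangular, lower triangular, diagonal if $A$ is upper triangular, lower triangular, diagonal respectively.

(c) $\bigwedge^k I_n = I_{\binom{n}{k}}$.

(d) If $A \in \mathrm{GL}(n, \mathbb{D})$ then $\bigwedge^k A \in \mathrm{GL}(\binom{n}{k}, \mathbb{D})$ and $(\bigwedge^k A)^{-1} = \bigwedge^k A^{-1}$.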
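Parts (a) and (c) of Problem 6 can be illustrated numerically. The sketch below assumes that $\bigwedge^k A$ is represented by the $k$-th compound matrix, i.e. the matrix of all $k \times k$ minors with row and column index sets in lexicographic order; the function names are hypothetical.

```python
from itertools import combinations, permutations


def det(a):
    """Determinant by the Leibniz expansion (adequate for tiny matrices)."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
        term = -1 if inversions % 2 else 1
        for i in range(n):
            term *= a[i][p[i]]
        total += term
    return total


def compound(a, k):
    """k-th compound matrix: all k x k minors, index sets in lexicographic order."""
    row_sets = list(combinations(range(len(a)), k))
    col_sets = list(combinations(range(len(a[0])), k))
    return [
        [det([[a[i][j] for j in cols] for i in rows]) for cols in col_sets]
        for rows in row_sets
    ]


def matmul(a, b):
    """Ordinary matrix product of lists of rows."""
    return [
        [sum(a[i][t] * b[t][j] for t in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


A = [[1, 2, 0], [0, 1, 3]]        # 2 x 3
B = [[1, 0], [2, 1], [0, 4]]      # 3 x 2
identity3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# Part (a) with k = 2 (the Cauchy-Binet formula).
print(compound(matmul(A, B), 2), matmul(compound(A, 2), compound(B, 2)))

# Part (c): the second compound of I_3 is I_3, since binom(3, 2) = 3.
print(compound(identity3, 2) == identity3)
```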

7. Let $\mathbb{F}$ be an algebraically closed field. Recall that over an algebraically closed field any $A \in \mathbb{F}^{n \times n}$ is similar to an upper triangular matrix.

(a) Let $A_i \in \mathbb{F}^{n_i \times n_i}$ for $i = 1, \ldots, k$. Show that there exist $T_i \in \mathrm{GL}(n_i, \mathbb{F})$ such that $(\bigotimes_{i=1}^k T_i)(\bigotimes_{i=1}^k A_i)(\bigotimes_{i=1}^k T_i)^{-1}$ is an upper triangular matrix. Furthermore, let $\lambda_{1,i}, \ldots, \lambda_{n_i,i}$ be the eigenvalues of $A_i$, counted with their multiplicities. Then $\prod_{i=1}^k \lambda_{j_i,i}$ for $j_i = 1, \ldots, n_i$, $i = 1, \ldots, k$ are the eigenvalues of $\bigotimes_{i=1}^k A_i$ counted with their multiplicities.

(b) Let $A \in \mathbb{F}^{n \times n}$ and assume that $\lambda_1, \ldots, \lambda_n$ are the eigenvalues of $A$ counted with their multiplicities. Show that $\prod_{i=1}^k \lambda_{j_i}$ for $1 \le j_1 < \cdots < j_k \le n$ are all the eigenvalues of $\bigwedge^k A$ counted with their multiplicities.

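8. Let $M$ be a finitely generated module over $\mathbb{D}$.

(a) Let $\mathbf{m}_1, \ldots, \mathbf{m}_k \in M$. Show that for any $\sigma \in S_k$, $\mathbf{m}_{\sigma(1)} \wedge \cdots \wedge \mathbf{m}_{\sigma(k)} = \mathrm{sgn}(\sigma)\, \mathbf{m}_1 \wedge \cdots \wedge \mathbf{m}_k$. In particular, if $\mathbf{m}_i = \sum_{j \ne i} a_j \mathbf{m}_j$, then $\mathbf{m}_1 \wedge \cdots \wedge \mathbf{m}_k = 0$.

(b) Prove the equality (5.2.8).

(c) Assume that $\mathbf{d}_1, \ldots, \mathbf{d}_m$ is a basis of $M$ and $\mathbf{f}_1, \ldots, \mathbf{f}_m$ is a dual basis of $M'$. Show that $\frac{1}{k!} \mathbf{f}_{i_1} \wedge \cdots \wedge \mathbf{f}_{i_k}$, $1 \le i_1 < \cdots < i_k \le m$ can be viewed as a basis for $(\bigwedge^k M)'$ for $k \in [1, m]$.

9. Let $M$ be an $m$-dimensional module over $\mathbb{D}$ as defined in Problem 1.6.1. Show

• $\bigwedge^m M$ is a 1-dimensional module over $\mathbb{D}$.

• $\bigwedge^k M$ is a zero module over $\mathbb{D}$ for $k > m$.

10. (a) Let $V$ be a finite dimensional vector space over $\mathbb{F}$ and assume that $U, W$ are subspaces of $V$. Show that $\bigwedge^k U \cap \bigwedge^k W = \bigwedge^k (U \cap W)$.

Hint: Choose a basis $\mathbf{v}_1, \ldots, \mathbf{v}_n$ in $V$ satisfying the following property: $\mathbf{v}_1, \ldots, \mathbf{v}_m$ and $\mathbf{v}_1, \ldots, \mathbf{v}_l, \mathbf{v}_{m+1}, \ldots, \mathbf{v}_{m+p-l}$ are bases for $U$ and $W$ respectively. Recall that $\mathbf{v}_{i_1} \wedge \cdots \wedge \mathbf{v}_{i_k}$, $1 \le i_1 < \cdots < i_k \le n$ form a basis in $\bigwedge^k V$. Observe next that bases of $\bigwedge^k U$ and $\bigwedge^k W$ are formed by the exterior (wedge) products of $k$ vectors from $\mathbf{v}_1, \ldots, \mathbf{v}_m$ and $\mathbf{v}_1, \ldots, \mathbf{v}_l, \mathbf{v}_{m+1}, \ldots, \mathbf{v}_{m+p-l}$ respectively.

(b) Assume that $V$ is an $n$-dimensional module over $\mathbb{D}_b$. Suppose furthermore that $U, W$ are finitely generated submodules of $V$. Show that $\bigwedge^k U \cap \bigwedge^k W = \bigwedge^k (U \cap W)$.

11. Let $V$ be an $n$-dimensional vector space over $\mathbb{F}$ and $U \subseteq V$, $W \subseteq V'$ be $m$-dimensional subspaces. Show

(a) Let $\{\mathbf{u}_1, \ldots, \mathbf{u}_m\}$, $\{\mathbf{f}_1, \ldots, \mathbf{f}_m\}$ be bases of $U, W$ respectively. Then vanishing of the determinant $\det [\langle \mathbf{u}_i, \mathbf{f}_j \rangle]_{i,j=1}^m$ is independent of the choice of bases in $U, W$.

(b) Let $\mathbb{F}$ be a field of infinite characteristic. TFAE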






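i. $\dim U^\perp \cap W > 0$.

ii. $\dim U \cap W^\perp > 0$.

iii. $\bigwedge^m U \subseteq (\bigwedge^m W)^\perp$.

iv. $\bigwedge^m W \subseteq (\bigwedge^m U)^\perp$.

v. For any bases $\{\mathbf{u}_1, \ldots, \mathbf{u}_m\}$, $\{\mathbf{f}_1, \ldots, \mathbf{f}_m\}$ of $U, W$ respectively $\langle \mathbf{u}_1 \wedge \cdots \wedge \mathbf{u}_m, \mathbf{f}_1 \wedge \cdots \wedge \mathbf{f}_m \rangle = 0$.

Hint: If $\dim U^\perp \cap W = 0$ use Problem 2(e). If $\dim U^\perp \cap W > 0$ choose at least one vector of a basis in $W$ to be in $U^\perp \cap W$ and use (5.2.8).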

5.3 Sparse bases of subspaces 

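Definition 5.3.1 1. For $0 \ne \mathbf{x} \in \mathbb{F}^n$ denote $\mathrm{span}(\mathbf{x})^* := \mathrm{span}(\mathbf{x}) \setminus \{0\}$.

2. The support of $\mathbf{x} = (x_1, \ldots, x_n)^\top \in \mathbb{F}^n$ is defined as $\mathrm{supp}(\mathbf{x}) = \{i \in \{1, \ldots, n\} : x_i \ne 0\}$.

3. For a nonzero subspace $U \subseteq \mathbb{F}^n$, a nonzero vector $\mathbf{x} \in U$ is called elementary if for every $0 \ne \mathbf{y} \in U$ the condition $\mathrm{supp}(\mathbf{y}) \subseteq \mathrm{supp}(\mathbf{x})$ implies $\mathrm{supp}(\mathbf{y}) = \mathrm{supp}(\mathbf{x})$. $\mathrm{span}(\mathbf{x})^*$ is called an elementary class, in $U$, if $\mathbf{x} \in U$ is elementary.

4. Denote by $\mathcal{E}(U)$ the union of all elementary classes in $U$.

5. A basis $\{\mathbf{u}_1, \ldots, \mathbf{u}_m\}$ in $U$ is called sparse if $\mathbf{u}_1, \ldots, \mathbf{u}_m$ are elementary.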

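Proposition 5.3.2 Let $U$ be a subspace of $\mathbb{F}^n$ of dimension $m \in [1, n]$. Then

1. $\mathbf{x} \in U$ is elementary if and only if for each $0 \ne \mathbf{y} \in U$ the condition $\mathrm{supp}(\mathbf{y}) \subseteq \mathrm{supp}(\mathbf{x})$ implies that $\mathbf{y} \in \mathrm{span}(\mathbf{x})^*$.

2. $\mathcal{E}(U)$ consists of a finite number of elementary classes.

3. $\mathrm{span}(\mathcal{E}(U)) = U$.

4. For each subset $I$ of $\{1, \ldots, n\}$ of cardinality $m - 1$ there exists an elementary $\mathbf{x} \in U$ such that $\mathrm{supp}(\mathbf{x})^c := \{1, \ldots, n\} \setminus \mathrm{supp}(\mathbf{x})$ contains $I$.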

See Problem 1 for proof. 
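Part 1 of Proposition 5.3.2 gives a finite test for elementarity: $\mathbf{x}$ is elementary exactly when the vectors of $U$ supported inside $\mathrm{supp}(\mathbf{x})$ form a one dimensional subspace. The sketch below implements that test over the rationals; it assumes $U$ is given as the row span of a matrix with linearly independent rows and uses 0-based coordinate indices. The function names are illustrative only.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def support(x):
    """0-based support of a vector."""
    return {i for i, v in enumerate(x) if v != 0}


def is_elementary(basis, x):
    """Test whether the nonzero vector x in the row span of `basis` is elementary.

    A vector y = c * basis has supp(y) inside supp(x) iff c annihilates the
    columns outside supp(x); x is elementary iff that solution space has
    dimension exactly 1.
    """
    zero_cols = [j for j in range(len(x)) if j not in support(x)]
    k = len(basis)
    if not zero_cols:
        return k == 1
    restricted = [[row[j] for j in zero_cols] for row in basis]
    return k - rank(restricted) == 1


U_basis = [[1, 0, 1], [0, 1, 1]]
print(is_elementary(U_basis, [1, 0, 1]))   # supported on coordinates 0 and 2
print(is_elementary(U_basis, [1, -1, 0]))  # supported on coordinates 0 and 1
print(is_elementary(U_basis, [1, 1, 2]))   # full support
```

In this example the first two vectors are elementary, while the third is not, because its support strictly contains the support of $(1, 0, 1)$.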

Definition 5.3.3 Let $\mathbb{F}$ be a field of characteristic 0.

1. $A = (a_{ij}) \in \mathbb{F}^{k \times n}$ is called generic if all the entries of $A$ are algebraically independent over $\mathbb{Q}$, i.e. there is no nontrivial polynomial $p$ in $kn$ variables with integer coefficients such that $p(a_{11}, \ldots, a_{kn}) = 0$.

2. $A$ is called nondegenerate if all $\min(k, n)$ minors of $A$ are nonzero.

3. A subspace $U \subseteq \mathbb{F}^n$ of dimension $m \ge 1$ is called nondegenerate if for each $J \subseteq \{1, \ldots, n\}$ of cardinality $n - m + 1$ there exists a unique elementary class $\mathrm{span}(\mathbf{x})^*$ such that $J = \mathrm{supp}(\mathbf{x})$.
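Nondegeneracy of a concrete matrix can be tested directly from part 2 of the definition by enumerating all minors of order $\min(k, n)$. The following brute-force sketch does exactly that for small integer matrices; it is exponential in the matrix size and meant only as an illustration. The name `is_nondegenerate` is hypothetical.

```python
from itertools import combinations, permutations


def det(a):
    """Determinant by the Leibniz expansion (adequate for tiny matrices)."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
        term = -1 if inversions % 2 else 1
        for i in range(n):
            term *= a[i][p[i]]
        total += term
    return total


def is_nondegenerate(a):
    """True iff every minor of order min(k, n) of the k x n matrix a is nonzero."""
    k, n = len(a), len(a[0])
    order = min(k, n)
    for rows in combinations(range(k), order):
        for cols in combinations(range(n), order):
            if det([[a[i][j] for j in cols] for i in rows]) == 0:
                return False
    return True


print(is_nondegenerate([[1, 1, 1], [1, 2, 3]]))  # minors 1, 2, 1
print(is_nondegenerate([[1, 2, 3], [2, 4, 5]]))  # first two columns are proportional
```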

Lemma 5.3.4 Let $A \in \mathbb{F}^{k \times n}$, $1 \le k \le n$ be of rank $k$. TFAE:

1. $A$ is nondegenerate.

2. The row space of $A$ (viewed as a column space of $A^\top$) is nondegenerate.

3. The null space of $A$ is nondegenerate.

Proof. Consider first the column space of $A^\top$ denoted by $U \subseteq \mathbb{F}^n$. Recall that any vector in $U$ is of the form $\mathbf{x} = A^\top \mathbf{y}$ for some $\mathbf{y} \in \mathbb{F}^k$. Let $I \subseteq \{1, \ldots, n\}$ be a set of cardinality $k - 1$. Let $B = (A^\top)[I, :] \in \mathbb{F}^{(k-1) \times k}$ be the submatrix of $A^\top$ with the rows indexed by the set $I$. The condition that $\mathrm{supp}(\mathbf{x}) \subseteq I^c$ is equivalent to the condition $B\mathbf{y} = 0$. Since $\mathrm{rank}\, B \le k - 1$ there exists $0 \ne \mathbf{x} \in U$ such that $\mathrm{supp}(\mathbf{x}) \subseteq I^c$. Let $\mathbf{d}$ be defined as in Problem 3.

Assume that $\mathrm{rank}\, B < k - 1$. Then $\mathbf{d} = 0$, see Problem 3(a). Furthermore, it is straightforward to show that for each $j \in I^c$ there exists a nonzero $\mathbf{x} \in U$ such that $\mathrm{supp}(\mathbf{x}) \subseteq (I \cup \{j\})^c$. So $\det A[:, I \cup \{j\}] = 0$ and $A$ is not nondegenerate.

Suppose that $\mathrm{rank}\, B = k - 1$, i.e. $\mathbf{d} \ne 0$. Then any nonzero $\mathbf{x} \in U$ with $\mathrm{supp}(\mathbf{x}) \subseteq I^c$ is in $\mathrm{span}(A^\top \mathbf{d})^*$. Let $j \in I^c$. Expand $\det A[:, I \cup \{j\}]$ by the column $j$ to deduce that $(A^\top \mathbf{d})_j = \pm \det A[:, I \cup \{j\}]$. Thus $\mathrm{supp}(\mathbf{x}) = I^c$ if and only if $\det A[:, I \cup \{j\}] \ne 0$ for each $j \in I^c$. These arguments show the equivalence of 1 and 2.

The equivalence of 1 and 3 is shown in a similar way and is discussed in Problem 4.
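The construction used in the proof can be traced on a small example: for an index set $I$ of cardinality $k - 1$, form $B = (A^\top)[I, :]$, build $\mathbf{d}$ from the signed maximal minors of $B$ as in Problem 3, and take $\mathbf{x} = A^\top \mathbf{d}$. The sketch below assumes $d_i$ denotes the determinant of $B$ with column $i$ deleted and uses 0-based indices; the function name is illustrative.

```python
from itertools import permutations


def det(a):
    """Determinant by the Leibniz expansion; the empty matrix has determinant 1."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
        term = -1 if inversions % 2 else 1
        for i in range(n):
            term *= a[i][p[i]]
        total += term
    return total


def elementary_row_vector(A, I):
    """Return (d, x) with x = A^T d vanishing on the coordinates listed in I.

    A is a k x n matrix (list of rows) and I lists k - 1 column indices of A.
    """
    k, n = len(A), len(A[0])
    # Rows of A^T indexed by I, i.e. the chosen columns of A written as rows.
    B = [[A[r][i] for r in range(k)] for i in I]
    d = []
    for c in range(k):
        minor = [[row[j] for j in range(k) if j != c] for row in B]
        d.append((-1) ** c * det(minor))
    x = [sum(A[r][j] * d[r] for r in range(k)) for j in range(n)]
    return d, x


A = [[1, 1, 1, 1], [1, 2, 3, 4], [1, 4, 9, 16]]
d, x = elementary_row_vector(A, [0, 1])
print(d, x)
```

For this nondegenerate $A$ the vector $\mathbf{x}$ vanishes exactly on $I = \{0, 1\}$, so its support is $I^c$, in line with the second case of the proof.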

□ 

Definition 5.3.5 Let $\mathcal{J} = \{J_1, \ldots, J_t\}$ be $t$ subsets of $\langle n \rangle$ each of cardinality $m - 1$. Then $\mathcal{J}$ satisfies the $m$-intersection property provided that

(5.3.1)
$$\# \bigcap_{i \in P} J_i \le m - \#P \quad \text{for all } \emptyset \ne P \subseteq \langle t \rangle.$$

It is known that given a set $\mathcal{J}$ of $t$ subsets satisfying the above assumptions, one can check effectively, i.e. in polynomial time, if $\mathcal{J}$ satisfies the $m$-intersection property. See Problems 5-7.
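For small families the condition (5.3.1) can be checked directly by enumerating all nonempty index sets $P$. The sketch below is such a brute-force check; it is exponential in $t$ and is not the polynomial-time method referred to above. The function name is hypothetical.

```python
from itertools import combinations


def has_m_intersection_property(sets, m):
    """Brute-force test of (5.3.1) for a family of (m - 1)-element sets."""
    family = [frozenset(s) for s in sets]
    if any(len(s) != m - 1 for s in family):
        raise ValueError("every set must have cardinality m - 1")
    for size in range(1, len(family) + 1):
        # Each choice of `size` positions in the family plays the role of P.
        for chosen in combinations(family, size):
            if len(frozenset.intersection(*chosen)) > m - size:
                return False
    return True


print(has_m_intersection_property([{1, 2}, {2, 3}, {1, 3}], 3))
print(has_m_intersection_property([{1, 2}, {1, 3}, {1, 4}], 3))
```

The second family fails because all three sets contain the element 1, so the intersection over $P = \langle 3 \rangle$ has cardinality $1 > m - \#P = 0$.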

The aim of this section is to prove the following theorem.

Theorem 5.3.6 Let $\mathbb{F}$ be a field of characteristic 0 and assume that $A \in \mathbb{F}^{k \times n}$ is generic over $\mathbb{Q}$.

1. Let $\mathcal{I} = \{I_1, \ldots, I_s\}$ denote the collection of $s \le k$ subsets of $\langle n \rangle$ each of cardinality $n - k + 1$. Then the elementary vectors $\mathbf{x}(I_1), \ldots, \mathbf{x}(I_s)$ in the row space of $A$ with supports $I_1, \ldots, I_s$ are linearly independent if and only if $\mathcal{I}' := \{I_1^c, \ldots, I_s^c\}$, consisting of the complements of the supports, has the $k$-intersection property.

2. Let $\mathcal{J} = \{J_1, \ldots, J_t\}$ denote the collection of $t \le n - k$ subsets of $\langle n \rangle$ each of cardinality $k + 1$. Then the elementary vectors $\mathbf{y}(J_1), \ldots, \mathbf{y}(J_t)$ in the null space of $A$ with supports $J_1, \ldots, J_t$ are linearly independent if and only if $\mathcal{J}' := \{J_1^c, \ldots, J_t^c\}$, consisting of the complements of the supports, has the $(n - k)$-intersection property.

The proof of this theorem needs a number of auxiliary results. 

Lemma 5.3.7 Let $A \in \mathbb{F}^{k \times n}$ be nondegenerate.

1. Let $\mathcal{I} = \{I_1, \ldots, I_s\}$ denote the collection of $s \le k$ subsets of $\langle n \rangle$ each of cardinality $n - k + 1$. Then the elementary vectors $\mathbf{x}(I_1), \ldots, \mathbf{x}(I_s)$ in the row space of $A$ with supports $I_1, \ldots, I_s$ are linearly independent if and only if the $k \times s$ submatrix of $\bigwedge^{k-1} A$ determined by its columns indexed by $I_1^c, \ldots, I_s^c$ has rank $s$.

2. Let $\mathbf{b}_1, \ldots, \mathbf{b}_{n-k} \in \mathbb{F}^n$ be a basis in the null space of $A$ and denote by $B^\top \in \mathbb{F}^{n \times (n-k)}$ the matrix whose columns are $\mathbf{b}_1, \ldots, \mathbf{b}_{n-k}$. Let $\mathcal{J} = \{J_1, \ldots, J_t\}$ denote the collection of $t \le n - k$ subsets of $\langle n \rangle$ each of cardinality $k + 1$. Then the elementary vectors $\mathbf{y}(J_1), \ldots, \mathbf{y}(J_t)$ in the null space of $A$ with supports $J_1, \ldots, J_t$ are linearly independent if and only if the $(n - k) \times t$ submatrix of $\bigwedge^{n-k-1} B$ determined by its columns indexed by $J_1^c, \ldots, J_t^c$ has rank $t$.

See Problems 8-9 for the proof of the lemma. 

Corollary 5.3.8 Let $A \in \mathbb{F}^{k \times n}$ be nondegenerate.

1. Let $\mathcal{I} = \{I_1, \ldots, I_k\}$ denote the collection of $k$ subsets of $\langle n \rangle$ each of cardinality $n - k + 1$. Then the elementary vectors $\mathbf{x}(I_1), \ldots, \mathbf{x}(I_k)$ in the row space of $A$ with supports $I_1, \ldots, I_k$ are not linearly independent if and only if the determinant of the full row $k \times k$ submatrix of $\bigwedge^{k-1} X$ determined by its columns indexed by $I_1^c, \ldots, I_k^c$ is identically zero for any $X \in \mathbb{F}^{k \times n}$.

2. Let $\mathbf{b}_1, \ldots, \mathbf{b}_{n-k} \in \mathbb{F}^n$ be a basis in the null space of $A$ and denote by $B^\top \in \mathbb{F}^{n \times (n-k)}$ the matrix whose columns are $\mathbf{b}_1, \ldots, \mathbf{b}_{n-k}$. Let $\mathcal{J} = \{J_1, \ldots, J_{n-k}\}$ denote the collection of $n - k$ subsets of $\langle n \rangle$ each of cardinality $k + 1$. Then the elementary vectors $\mathbf{y}(J_1), \ldots, \mathbf{y}(J_{n-k})$ in the null space of $A$ with supports $J_1, \ldots, J_{n-k}$ are not linearly independent if and only if the determinant of the full row $(n - k) \times (n - k)$ submatrix of $\bigwedge^{n-k-1} Y$ determined by its columns indexed by $J_1^c, \ldots, J_{n-k}^c$ is identically zero for any $Y \in \mathbb{F}^{(n-k) \times n}$.

(One may use Problem 10 to show part 2 of the above Corollary.) 

Definition 5.3.9 Let $V$ be an $n$-dimensional vector space over $\mathbb{F}$. Let $U_1, \ldots, U_t \subseteq V$ be $t$ subspaces of dimension $m - 1$. Then $\{U_1, \ldots, U_t\}$ satisfies the dimension $m$-intersection property provided that

(5.3.2)
$$\dim \bigcap_{i \in P} U_i \le m - \#P \quad \text{for all } \emptyset \ne P \subseteq \langle t \rangle.$$

Theorem 5.3.6 follows from the following theorem. 

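Theorem 5.3.10 Let $V$ be an $n$-dimensional vector space over a field $\mathbb{F}$ of characteristic 0 and $n \ge 2$. Let $2 \le m \in \langle n \rangle$ and assume that $U_1, \ldots, U_m \in \mathrm{Gr}_{m-1}(V')$. Let $\mathcal{W}_m(U_1, \ldots, U_m) \subseteq \mathrm{Gr}_m(V)$ be the variety of all subspaces $X \in \mathrm{Gr}_m(V)$ such that the one dimensional subspace $Y := \bigwedge^m (\bigwedge^{m-1} X) \subseteq \bigotimes^{m(m-1)} V$ is orthogonal to the subspace

$$W := \big(\textstyle\bigwedge^{m-1} U_1\big) \wedge \big(\textstyle\bigwedge^{m-1} U_2\big) \wedge \cdots \wedge \big(\textstyle\bigwedge^{m-1} U_m\big) \subseteq \textstyle\bigotimes^{m(m-1)} V'$$

of dimension one at most. Then $\mathcal{W}_m(U_1, \ldots, U_m)$ is a strict subvariety of $\mathrm{Gr}_m(V)$ if and only if $U_1, \ldots, U_m$ satisfy the dimension $m$-intersection property.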

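Proof. Since each $U_i$ is $m - 1$ dimensional we assume that $\bigwedge^{m-1} U_i = \mathrm{span}(\mathbf{w}_i)$ for some $\mathbf{w}_i \in \bigwedge^{m-1} U_i$ for $i = 1, \ldots, m$. Then $W = \mathrm{span}(\mathbf{w}_1 \wedge \cdots \wedge \mathbf{w}_m)$. Choose a basis $\mathbf{x}_1, \ldots, \mathbf{x}_m$ in $X$. Let $\mathbf{y}_i$ be the wedge product of the $m - 1$ vectors from $\{\mathbf{x}_1, \ldots, \mathbf{x}_m\} \setminus \{\mathbf{x}_i\}$ for $i = 1, \ldots, m$. Then $\mathbf{y}_1, \ldots, \mathbf{y}_m$ are linearly independent and $Y = \mathrm{span}(\mathbf{y}_1 \wedge \cdots \wedge \mathbf{y}_m)$. The condition that $Y \perp W$, i.e. $Y^\perp \cap W$ is a nontrivial subspace, is equivalent to the condition

(5.3.3)
$$\langle \mathbf{y}_1 \wedge \cdots \wedge \mathbf{y}_m, \mathbf{w}_1 \wedge \cdots \wedge \mathbf{w}_m \rangle = m! \det \big( \langle \mathbf{y}_i, \mathbf{w}_j \rangle \big)_{i,j=1}^m = 0.$$

See Problem 5.2.11. Since $\mathbb{F}$ has characteristic 0, the condition (5.3.3) is equivalent to the vanishing of the determinant in the above formula. We will use the formula (5.2.8) for each $\langle \mathbf{y}_i, \mathbf{w}_j \rangle$.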

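Assume first that $U_1, \ldots, U_m$ do not satisfy the dimension intersection property. By interchanging the order of $U_1, \ldots, U_m$ if necessary, we may assume that there exists $2 \le p \le m$ such that $Z := \bigcap_{j=1}^p U_j$ has dimension $m - p + 1$ at least. Let $Z_1 \subseteq Z$ be a subspace of dimension $m - p + 1$. Then $\dim X \cap Z_1^\perp \ge m - (m - p + 1) = p - 1$. Let $F \subseteq X \cap Z_1^\perp$ be a subspace of dimension $p - 1$. Assume that $\mathbf{x}_1, \ldots, \mathbf{x}_m$ is a basis of $X$ such that $\mathbf{x}_1, \ldots, \mathbf{x}_{p-1}$ is a basis of $F$, and let $X_i := \mathrm{span}(\{\mathbf{x}_1, \ldots, \mathbf{x}_m\} \setminus \{\mathbf{x}_i\})$. So $X_i \supseteq F$ for $i = p, \ldots, m$. Since $F$ annihilates $Z_1 \subseteq U_j$ and $\dim F + \dim Z_1 = m > m - 1 = \dim X_i = \dim U_j$, the pairing between $X_i$ and $U_j$ is degenerate, i.e. $X_i \cap U_j^\perp \ne \{0\}$ for $i = p, \ldots, m$, $j = 1, \ldots, p$.

Thus $\langle \mathbf{y}_i, \mathbf{w}_j \rangle = 0$ for $i = p, \ldots, m$, $j = 1, \ldots, p$. See Problem 5.2.11. Hence any $p \times p$ submatrix of $[\langle \mathbf{y}_i, \mathbf{w}_j \rangle]_{i,j=1}^m$ with the set of columns $\langle p \rangle$ must have a zero row. Expand $\det [\langle \mathbf{y}_i, \mathbf{w}_j \rangle]_{i,j=1}^m$ by the columns $\langle p \rangle$ to deduce that this determinant is zero. Hence $\mathcal{W}_m(U_1, \ldots, U_m) = \mathrm{Gr}_m(V)$.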

We now show by induction on $m$ that if $U_1, \ldots, U_m \in \mathrm{Gr}_{m-1}(V')$ satisfy the dimension $m$-intersection property then there exists $X \in \mathrm{Gr}_m(V)$ such that $\dim Y^\perp \cap W = 0$, for each $n = m, m + 1, \ldots$. Assume that $m = 2$. As $\dim (U_1 \cap U_2) = 0$ we deduce that $\dim (U_1 + U_2) = 2$. Let $U_i = \mathrm{span}(\mathbf{u}_i)$, $i = 1, 2$. Then $\{\mathbf{u}_1, \mathbf{u}_2\}$ is a basis in $Z = \mathrm{span}(\mathbf{u}_1, \mathbf{u}_2)$. Hence $Z^\perp$ is a subspace of $V$ of dimension $n - 2$. Thus there exists a subspace $X \in \mathrm{Gr}_2(V)$ such that $\dim X \cap Z^\perp = 0$. Note that $\bigwedge^{m-1} X = X$, $\bigwedge^{m-1} U_i = U_i$, $i = 1, 2$. Let $\mathbf{x}_1, \mathbf{x}_2$ be a basis in $X$. The negation of the condition (5.3.3) is equivalent to $\langle \mathbf{x}_1 \wedge \mathbf{x}_2, \mathbf{u}_1 \wedge \mathbf{u}_2 \rangle \ne 0$. Use Problems 5.1.2(e) and 5.2.11 to deduce this negation.

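Assume the induction hypothesis that for $2 \le l < n$ and any $l - 1$ dimensional subspaces $\hat{U}_1, \ldots, \hat{U}_l \subseteq V'$ satisfying the $l$-dimensional intersection property there exists $\hat{X} \in \mathrm{Gr}_l(V)$ such that $\dim \hat{Y}^\perp \cap \hat{W} = 0$. Let $m = l + 1$ and assume that $U_1, \ldots, U_m$ satisfy the $m$-dimensional intersection property. Let $\mathcal{P} := \{P \subseteq \langle m - 1 \rangle : \dim \bigcap_{i \in P} U_i = m - \#P\}$. Note that $\{i\} \in \mathcal{P}$ for each $i \in \langle m - 1 \rangle$. The $m$-intersection property yields that $U_m \cap (\bigcap_{i \in P} U_i)$ is a strict subspace of $\bigcap_{i \in P} U_i$ for each $P \in \mathcal{P}$, i.e. $\bigcap_{i \in P} U_i \not\subseteq U_m$ for each $P \in \mathcal{P}$. Equivalently $(\bigcap_{i \in P} U_i)^\perp \not\supseteq U_m^\perp$. Problem 12(d) yields that $U_m^\perp \setminus \bigcup_{P \in \mathcal{P}} (\bigcap_{i \in P} U_i)^\perp \ne \emptyset$. Let $\mathbf{x}_m \in U_m^\perp \setminus \bigcup_{P \in \mathcal{P}} (\bigcap_{i \in P} U_i)^\perp$. Define $\hat{U}_i := U_i \cap \{\mathbf{x}_m\}^\perp$, $i = 1, \ldots, l$. For $i \in \langle l \rangle$ we have that $\{i\} \in \mathcal{P}$, hence $\mathbf{x}_m \notin U_i^\perp$. Thus $\hat{U}_i \in \mathrm{Gr}_{l-1}(V')$, $i = 1, \ldots, l$. We claim that $\hat{U}_1, \ldots, \hat{U}_l$ satisfy the $l$-dimensional intersection property.

Assume to the contrary that the $l$-dimensional intersection property is violated. By renaming the indices in $\langle l \rangle$ we may assume that there is $2 \le k \in \langle l \rangle$ such that $\dim \bigcap_{i \in \langle k \rangle} \hat{U}_i > l - k = m - k - 1$. Since $\hat{U}_i \subseteq U_i$, $i \in \langle l \rangle$ we deduce that $\dim \bigcap_{i \in \langle k \rangle} U_i > m - k - 1$. The assumption that $U_1, \ldots, U_m$ satisfy the $m$-dimensional intersection property yields $\dim \bigcap_{i \in \langle k \rangle} U_i = m - k$, i.e. $\langle k \rangle \in \mathcal{P}$. Since $\mathbf{x}_m \notin (\bigcap_{i \in \langle k \rangle} U_i)^\perp$ we deduce that $\dim (\bigcap_{i \in \langle k \rangle} U_i) \cap \{\mathbf{x}_m\}^\perp = \dim \bigcap_{i \in \langle k \rangle} \hat{U}_i = m - k - 1$, contradicting our assumption. Hence $\hat{U}_1, \ldots, \hat{U}_l$ satisfy the $l$-dimensional intersection property.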

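Let $\mathbf{v}_1, \ldots, \mathbf{v}_{n-1}, \mathbf{x}_m$ be a basis in $V$. Let $\mathbf{f}_1, \ldots, \mathbf{f}_n$ be the dual basis in $V'$. (See Problem 5.1.2(d).) Note that $\hat{U}_i \subseteq \mathrm{span}(\mathbf{f}_1, \ldots, \mathbf{f}_{n-1})$. Let $V_1 = \mathrm{span}(\mathbf{v}_1, \ldots, \mathbf{v}_{n-1})$. Then we can identify $\mathrm{span}(\mathbf{f}_1, \ldots, \mathbf{f}_{n-1})$ with $V_1'$. The induction hypothesis yields the existence of $\hat{X} \in \mathrm{Gr}_l(V_1)$ such that $\dim \hat{Y}^\perp \cap \hat{W} = 0$. Assume that $\hat{X}$ is the column space of the matrix $X = [x_{ij}] \in \mathbb{F}^{n \times l}$. The existence of the above $\hat{X}$ is equivalent to the statement that the polynomial $p_{\hat{U}_1, \ldots, \hat{U}_l}(x_{11}, \ldots, x_{nl})$, defined as in Problem 15, is not identically zero. Recall that $U_m \subseteq \mathrm{span}(\mathbf{f}_1, \ldots, \mathbf{f}_{n-1})$. Problem 13 yields the existence of a nontrivial polynomial $p_{U_m}(x_{11}, \ldots, x_{nl})$ such that $\hat{X} \in \mathrm{Gr}_l(V_1)$, equal to the column space of $X = [x_{ij}] \in \mathbb{F}^{n \times l}$, satisfies the condition $\dim \hat{X} \cap U_m^\perp = 0 \iff p_{U_m}(x_{11}, \ldots, x_{nl}) \ne 0$. As $p_{U_m} \, p_{\hat{U}_1, \ldots, \hat{U}_l}$ is a nonzero polynomial we deduce the existence of $\hat{X} \in \mathrm{Gr}_l(V_1)$ such that $\dim \hat{X} \cap U_m^\perp = 0$ and $\hat{X} \notin \mathcal{W}_l(\hat{U}_1, \ldots, \hat{U}_l)$.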

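Assume that $\mathbf{x}_1, \ldots, \mathbf{x}_{m-1}$ is a basis of $\hat{X}$. Let $X := \mathrm{span}(\mathbf{x}_1, \ldots, \mathbf{x}_m)$. We claim that $X \notin \mathcal{W}_m(U_1, \ldots, U_m)$. Let $X_i$ be the $m - 1$ dimensional subspace spanned by $\{\mathbf{x}_1, \ldots, \mathbf{x}_m\} \setminus \{\mathbf{x}_i\}$ for $i = 1, \ldots, m$. Then $\bigwedge^{m-1} X_i = \mathrm{span}(\mathbf{y}_i)$, $i = 1, \ldots, m$ and $\bigwedge^{m-1} X = \mathrm{span}(\mathbf{y}_1, \ldots, \mathbf{y}_m)$. Let $\bigwedge^{m-1} U_i = \mathrm{span}(\mathbf{w}_i)$, $i = 1, \ldots, m$. Note that $\mathbf{x}_m \in X_i \cap U_m^\perp$ for $i = 1, \ldots, m - 1$. Problem 13 yields that $\langle \mathbf{y}_i, \mathbf{w}_m \rangle = 0$ for $i = 1, \ldots, m - 1$. Hence $\det [\langle \mathbf{y}_i, \mathbf{w}_j \rangle]_{i,j=1}^m = \langle \mathbf{y}_m, \mathbf{w}_m \rangle \det [\langle \mathbf{y}_i, \mathbf{w}_j \rangle]_{i,j=1}^{m-1}$. Since $X_m = \hat{X}$ we obtain that $\dim X_m \cap U_m^\perp = 0$. Hence $\langle \mathbf{y}_m, \mathbf{w}_m \rangle \ne 0$. It is left to show that $\det [\langle \mathbf{y}_i, \mathbf{w}_j \rangle]_{i,j=1}^{m-1} \ne 0$. Let $\hat{X}_i \subseteq X_i$ be the subspace of dimension $l - 1 = m - 2$ spanned by $\{\mathbf{x}_1, \ldots, \mathbf{x}_{m-1}\} \setminus \{\mathbf{x}_i\}$ for $i = 1, \ldots, m - 1$. Note that $\hat{X}_i \subseteq \hat{X}$. So $\bigwedge^{l-1} \hat{X}_i = \mathrm{span}(\hat{\mathbf{y}}_i)$ and we can assume that $\mathbf{y}_i = \hat{\mathbf{y}}_i \wedge \mathbf{x}_m$ for $i = 1, \ldots, m - 1$. Recall that $\hat{U}_i = \{\mathbf{x}_m\}^\perp \cap U_i$. As $\dim \hat{U}_i = \dim U_i - 1$ we deduce that there exists $\mathbf{u}_i \in U_i$ such that $\langle \mathbf{x}_m, \mathbf{u}_i \rangle = 1$ for $i = 1, \ldots, m - 1$. So $U_i = \hat{U}_i \oplus \mathrm{span}(\mathbf{u}_i)$. Let $\bigwedge^{l-1} \hat{U}_i = \mathrm{span}(\hat{\mathbf{w}}_i)$. We can assume that $\mathbf{w}_i = \hat{\mathbf{w}}_i \wedge \mathbf{u}_i$ for $i = 1, \ldots, m - 1$. Problem 14 yields that $\langle \hat{\mathbf{y}}_i \wedge \mathbf{x}_m, \hat{\mathbf{w}}_j \wedge \mathbf{u}_j \rangle = l \langle \hat{\mathbf{y}}_i, \hat{\mathbf{w}}_j \rangle$ for $i, j = 1, \ldots, m - 1$. Hence $\det [\langle \mathbf{y}_i, \mathbf{w}_j \rangle]_{i,j=1}^{m-1} = l^{m-1} \det [\langle \hat{\mathbf{y}}_i, \hat{\mathbf{w}}_j \rangle]_{i,j=1}^{m-1}$. Since $\hat{X} \notin \mathcal{W}_l(\hat{U}_1, \ldots, \hat{U}_l)$ we deduce that $\det [\langle \hat{\mathbf{y}}_i, \hat{\mathbf{w}}_j \rangle]_{i,j=1}^{m-1} \ne 0$, i.e. $X \notin \mathcal{W}_m(U_1, \ldots, U_m)$. □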

Lemma 5.3.11 Let $J_1, \ldots, J_t$ be $t < m \le n$ subsets of $\langle n \rangle$ each of cardinality $m - 1$. Assume that $J_1, \ldots, J_t$ satisfy the $m$-intersection property. Then there exist $m - t$ subsets $J_{t+1}, \ldots, J_m$ of $\langle n \rangle$ of cardinality $m - 1$ such that the sets $J_1, \ldots, J_m$ satisfy the $m$-intersection property.

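Proof. It suffices to show that there is a subset $J_{t+1} \subseteq \langle n \rangle$ of cardinality $m - 1$ such that $J_1, \ldots, J_{t+1}$ satisfy the $m$-intersection property. If $t = 1$ then choose $J_2 \ne J_1$. Assume that $t \ge 2$. Let $\mathcal{P} = \{P \subseteq \langle t \rangle : \# \bigcap_{i \in P} J_i = m - \#P\}$. Note that $\{i\} \in \mathcal{P}$ for $i \in \langle t \rangle$.

Let $P, Q \in \mathcal{P}$ and assume that $P \cap Q \ne \emptyset$. We claim that $P \cup Q \in \mathcal{P}$. Let $X := \bigcap_{i \in P} J_i$, $Y := \bigcap_{j \in Q} J_j$. Then $\#X = m - \#P$, $\#Y = m - \#Q$. Furthermore $\#(X \cap Y) = \#X + \#Y - \#(X \cup Y)$. Observe next that $X \cup Y \subseteq \bigcap_{k \in P \cap Q} J_k$. Hence the $m$-intersection property of $J_1, \ldots, J_t$ yields $\#(X \cup Y) \le m - \#(P \cap Q)$. Combine the $m$-intersection property with all the above facts to deduce

$$m - \#(P \cup Q) \ge \# \bigcap_{i \in P \cup Q} J_i = \#(X \cap Y) = m - \#P + m - \#Q - \#(X \cup Y)$$
$$\ge m - \#P + m - \#Q - (m - \#(P \cap Q)) = m - \#(P \cup Q).$$

It follows that there exists a partition $\{P_1, \ldots, P_l\}$ of $\langle t \rangle$ into $l$ sets such that equality in (5.3.1) holds for each $P_i$, and each $P \subseteq \langle t \rangle$ satisfying equality in (5.3.1) is a subset of some $P_i$.

As $\# \bigcap_{i \in P_1} J_i = m - \#P_1 \ge m - t \ge 1$, we let $x \in \bigcap_{i \in P_1} J_i$. Choose $J_{t+1}$ to be any subset of cardinality $m - 1$ such that $J_{t+1} \cap (\bigcap_{i \in P_1} J_i) = (\bigcap_{i \in P_1} J_i) \setminus \{x\}$. Since $\#J_{t+1} = m - 1$ it follows that $J_{t+1}$ contains exactly $\#P_1$ elements not in $\bigcap_{i \in P_1} J_i$.

We now show that $J_1, \ldots, J_{t+1}$ satisfy the $m$-intersection property. Let $Q \subseteq \langle t \rangle$ and $P := Q \cup \{t + 1\}$. If $Q \notin \mathcal{P}$ then $\# \bigcap_{i \in P} J_i \le m - \#Q - 1 = m - \#P$. Assume that $Q \in \mathcal{P}$. To show (5.3.1) we need to show that $\bigcap_{i \in Q} J_i \not\subseteq J_{t+1}$. Suppose first that $Q \subseteq P_1$. Then $x \in \bigcap_{i \in Q} J_i$ and $x \notin J_{t+1}$. Assume that $Q \subseteq P_j$, $j > 1$. So $P_1 \cap Q = \emptyset$ and $P_1 \cup Q \notin \mathcal{P}$. Hence

$$q := \# \Big( \big(\bigcap_{i \in P_1} J_i\big) \cap \big(\bigcap_{j \in Q} J_j\big) \Big) = \# \bigcap_{k \in P_1 \cup Q} J_k \le m - (\#P_1 + \#Q) - 1.$$

Thus $\# \big( (\bigcap_{j \in Q} J_j) \setminus (\bigcap_{i \in P_1} J_i) \big) = m - \#Q - q \ge \#P_1 + 1$. We showed above that $\# \big( J_{t+1} \setminus (\bigcap_{i \in P_1} J_i) \big) = \#P_1$. Therefore $\bigcap_{i \in Q} J_i \not\subseteq J_{t+1}$. □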
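The lemma guarantees that an extension exists; on small examples one can simply search for it. The sketch below extends a family one set at a time by trying all $(m-1)$-subsets of $\langle n \rangle$ in lexicographic order and keeping the first candidate that preserves (5.3.1). This exhaustive search is an illustration only and does not follow the constructive choice of $J_{t+1}$ made in the proof; the function names are hypothetical.

```python
from itertools import combinations


def satisfies(family, m):
    """Brute-force test of the m-intersection property (5.3.1)."""
    for size in range(1, len(family) + 1):
        for chosen in combinations(family, size):
            if len(frozenset.intersection(*chosen)) > m - size:
                return False
    return True


def extend_family(sets, m, n):
    """Extend a family with the m-intersection property to exactly m sets."""
    family = [frozenset(s) for s in sets]
    candidates = [frozenset(c) for c in combinations(range(1, n + 1), m - 1)]
    while len(family) < m:
        for candidate in candidates:
            if satisfies(family + [candidate], m):
                family.append(candidate)
                break
        else:
            # Not reached when the input satisfies the hypotheses of the lemma.
            raise ValueError("no admissible extension found")
    return family


extended = extend_family([{1, 2}], m=3, n=4)
print([sorted(s) for s in extended])
```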

Proof of Theorem 5.3.6. 

1. Suppose first that $J_1 := I_1^c, \ldots, J_s := I_s^c$ do not satisfy the $k$-intersection property. Let $P \subseteq \langle s \rangle$ be a set for which $q := \#(\bigcap_{i \in P} J_i) \ge k - \#P + 1$. Note that $\#P \ge 2$. We can assume that $P = \langle p \rangle$ for some $2 \le p \le s$. We claim that $\mathbf{y}_i := \bigwedge^{k-1} A[:, J_i]$, $i = 1, \ldots, p$ are linearly dependent. Let $J = \bigcap_{i=1}^p J_i$. By renaming the indices if necessary we may assume that $J = \langle q \rangle$. Suppose first that the columns $i = 1, \ldots, q$ are linearly dependent. Hence any $k - 1$ columns in $J_i$ are linearly dependent for $i = 1, \ldots, p$. Thus $\mathbf{y}_i = 0$ for $i = 1, \ldots, p$ and $\mathbf{y}_1, \ldots, \mathbf{y}_p$ are linearly dependent.

Assume now that the columns $i = 1, \ldots, q$ are linearly independent. Let $C \in \mathbb{F}^{k \times k}$ be an invertible matrix. Then $\bigwedge^{k-1} C$ is also invertible. Thus $\mathbf{y}_1, \ldots, \mathbf{y}_p$ are linearly dependent if and only if $(\bigwedge^{k-1} C)\mathbf{y}_1, \ldots, (\bigwedge^{k-1} C)\mathbf{y}_p$ are linearly dependent. Thus we may replace $A$ by $A_1 := CA$. Choose $C$ such that

$$A_1 = \begin{bmatrix} I_q & X \\ O & F \end{bmatrix},$$

where $O \in \mathbb{F}^{(k-q) \times q}$ is the zero matrix and $F \in \mathbb{F}^{(k-q) \times (n-q)}$.

Consider a $k - 1$ minor $A_1[\{i\}^c, K]$ for some $K \subseteq \langle n \rangle$ of cardinality $k - 1$ containing the set $J$. Expanding this minor by the first $q$ columns we deduce that it is equal to zero, unless $i = q + 1, \ldots, k$. Let $J_i' := J_i \setminus J$, $i = 1, \ldots, p$. Observe next that the submatrix of $\bigwedge^{k-1} A_1$ based on the rows $\{q+1\}^c, \ldots, \{k\}^c$ and columns $J_1, \ldots, J_p$ is equal to the matrix $\bigwedge^{k-q-1} F[:, \{J_1', \ldots, J_p'\}]$. Hence $\mathrm{rank}\, \bigwedge^{k-1} A_1[:, \{J_1, \ldots, J_p\}] = \mathrm{rank}\, \bigwedge^{k-q-1} F[:, \{J_1', \ldots, J_p'\}]$. Since $q \ge k - p + 1$ it follows that $F$ has at most $k - (k - p + 1) = p - 1$ rows, which implies that $\bigwedge^{k-q-1} F$ has at most $p - 1$ rows. Hence $\mathrm{rank}\, \bigwedge^{k-q-1} F[:, \{J_1', \ldots, J_p'\}] \le \mathrm{rank}\, \bigwedge^{k-q-1} F \le p - 1$. Lemma 5.3.7 implies that $\mathbf{x}(I_1), \ldots, \mathbf{x}(I_p)$ are linearly dependent, which yields that $\mathbf{x}(I_1), \ldots, \mathbf{x}(I_s)$ are linearly dependent.

Assume now that $J_1 := I_1^c, \ldots, J_s := I_s^c$ satisfy the $k$-intersection property. By Lemma 5.3.11 we can extend these $s$ sets to $k$ subsets $J_1, \ldots, J_k \subseteq \langle n \rangle$ of cardinality $k - 1$ which satisfy the $k$-intersection property. Let $V := \mathbb{F}^n$ and identify $V' := \mathbb{F}^n$, where $\langle \mathbf{v}, \mathbf{f} \rangle = \mathbf{f}^\top \mathbf{v}$. Let $\{\mathbf{f}_1, \ldots, \mathbf{f}_n\}$ be the standard basis in $\mathbb{F}^n$. Let $U_j = \bigoplus_{i \in J_j} \mathrm{span}(\mathbf{f}_i)$, $j = 1, \ldots, k$. Then $U_1, \ldots, U_k$ have the $k$-dimensional intersection property. (See Problem 16.) Theorem 5.3.10 yields that there exists a subspace $X \in \mathrm{Gr}_k(V)$ such that $X \notin \mathcal{W}_k(U_1, \ldots, U_k)$. Assume that $X$ is the column space of $B^\top \in \mathbb{F}^{n \times k}$. Assume that the columns of $B^\top$ are $\mathbf{b}_1, \ldots, \mathbf{b}_k$. As in the proof of Theorem 5.3.10 let $\mathbf{y}_i = \bigwedge_{j \in \langle k \rangle \setminus \{i\}} \mathbf{b}_j$, $\mathbf{w}_j = \bigwedge_{i \in J_j} \mathbf{f}_i$, $i, j = 1, \ldots, k$. Note that $\mathbf{y}_i$ is the $i$-th column of $\bigwedge^{k-1} B^\top$. Furthermore $\langle \mathbf{y}_i, \mathbf{w}_j \rangle = \bigwedge^{k-1} B[\{i\}^c, J_j]$. The choice of $B$ is equivalent to the condition $\det [\langle \mathbf{y}_i, \mathbf{w}_j \rangle]_{i,j=1}^k \ne 0$. This is equivalent to the condition that the minor of the $k \times k$ submatrix of $\bigwedge^{k-1} B$ based on the columns $J_1, \ldots, J_k$ is not equal to zero. Since $A$ is generic, the corresponding minor of $\bigwedge^{k-1} A$ is nonzero. (Otherwise the entries of $A$ will satisfy some nontrivial polynomial equations with integer coefficients.) Hence the $k$ columns of $\bigwedge^{k-1} A$ corresponding to $J_1, \ldots, J_k$ are linearly independent. In particular the $s$ columns of $\bigwedge^{k-1} A$ corresponding to $J_1, \ldots, J_s$ are linearly independent. Lemma 5.3.7 implies that $\mathbf{x}(I_1), \ldots, \mathbf{x}(I_s)$ are linearly independent.

2. For a generic $A$ let $B \in \mathbb{F}^{(n-k) \times n}$ be such that the columns of $B^\top$ span the null space of $A$. So $AB^\top = 0$ and $\mathrm{rank}\, B = n - k$. According to Problem 4(e) $B$ is generic. Note that for any $J \subseteq \langle n \rangle$ of cardinality $k + 1$, $\mathbf{x}(J, B) = \mathbf{y}(J, A)$.

Assume that $J_1^c, \ldots, J_t^c$ do not satisfy the $(n - k)$-intersection property. The above arguments and 1 imply that $\mathbf{y}(J_1, A), \ldots, \mathbf{y}(J_t, A)$ are linearly dependent.

Suppose now that $J_1^c, \ldots, J_t^c$ satisfy the $(n - k)$-intersection property. Extend this set to the set $J_1, \ldots, J_{n-k}$, each set of cardinality $k + 1$, such that $J_1^c, \ldots, J_{n-k}^c$ satisfy the $(n - k)$-intersection property. Let $B \in \mathbb{F}^{(n-k) \times n}$ be generic. 1 implies that the $n - k$ vectors $\mathbf{x}(J_1, B), \ldots, \mathbf{x}(J_{n-k}, B)$ in the row space of $B$ are linearly independent. Let $A \in \mathbb{F}^{k \times n}$ be such that the columns of $A^\top$ span the null space of $B$. So $\mathrm{rank}\, A = k$ and $BA^\top = 0$. According to Problem 4(e) $A$ is generic. Hence it follows that $\mathbf{x}(J_i, B) = \mathbf{y}(J_i, A)$, $i = 1, \ldots, n - k$ are linearly independent vectors. Problems 3-4 yield that we can express the coordinates of each elementary vector in the null space of $A$ in terms of corresponding $k \times k$ minors of $A$. Form the matrix $C = [\mathbf{y}(J_1, A) \ \ldots \ \mathbf{y}(J_{n-k}, A)] \in \mathbb{F}^{n \times (n-k)}$. Since $\mathrm{rank}\, C = n - k$ it follows that some $(n - k)$ minor of $C$ is different from zero. Hence for any generic $A$ the corresponding minor of $C$ is different from zero, i.e. the vectors $\mathbf{y}(J_1, A), \ldots, \mathbf{y}(J_{n-k}, A)$ are always linearly independent for a generic $A$. In particular, the vectors $\mathbf{y}(J_1, A), \ldots, \mathbf{y}(J_t, A)$ are linearly independent.

□ 



Problems 

1. Prove Proposition 5.3.2. 

2. Let $A \in \mathbb{F}^{k \times n}$ be generic. Show

(a) All entries of $A$ are nonzero.

(b) $A$ is nondegenerate.

3. Let $\mathbb{D}$ be a domain and $B \in \mathbb{D}^{(k-1) \times k}$, where $k > 1$. Let $B_i \in \mathbb{D}^{(k-1) \times (k-1)}$ be the matrix obtained by deleting the column $i$ for $i = 1, \ldots, k$, and let $d_i := \det B_i$. Denote $\mathbf{d} = (d_1, -d_2, \ldots, (-1)^{k-1} d_k)^\top$. Show

(a) $\mathbf{d} = 0$ if and only if $\mathrm{rank}\, B < k - 1$.

(b) $B\mathbf{d} = 0$.

(c) Assume that $\mathbf{x} \in \ker B$. If $\mathrm{rank}\, B = k - 1$ then $\mathbf{x} = b\mathbf{d}$ for some $b$ in the division field of $\mathbb{D}$.

4. Let $A \in \mathbb{F}^{k \times n}$, $1 \le k \le n$. Assume that $1 \le \mathrm{rank}\, A = l \le k$. Show

(a) For any $I \subseteq \{1, \ldots, n\}$ of cardinality $n - l - 1$ there exists $0 \ne \mathbf{x} \in \mathrm{nul}\, A$ such that $\mathrm{supp}(\mathbf{x}) \subseteq I^c$.

(b) Let $I \subseteq \{1, \ldots, n\}$ be of cardinality $n - k - 1$ and denote $B := A[:, I^c] \in \mathbb{F}^{k \times (k+1)}$. Then $\dim \{\mathbf{x} \in \mathrm{nul}\, A : \mathrm{supp}(\mathbf{x}) \subseteq I^c\} = 1$ if and only if $\mathrm{rank}\, B = k$.

(c) Let $I \subseteq \{1, \ldots, n\}$ be of cardinality $n - k - 1$ and denote $B := A[:, I^c] \in \mathbb{F}^{k \times (k+1)}$. Then there exists an elementary vector $\mathbf{x} \in \mathrm{nul}\, A$ with $\mathrm{supp}(\mathbf{x}) = I^c$ if and only if for each $j \in I^c$, $\det A[:, I^c \setminus \{j\}] \ne 0$.

(d) The conditions 1 and 3 of Lemma 5.3.4 are equivalent.

(e) Let $\mathrm{rank}\, A = l < k$. Then $\mathrm{nul}\, A$ is nondegenerate if all minors of $A$ of order $l$ are nonzero.

5. Let $\mathcal{J}$ be defined in Definition 5.3.5. Show

(a) The condition (5.3.1) is equivalent to

$$\# \Big( \bigcup_{i \in P} J_i^c \Big) \ge n - m + \#P \quad \text{for all } \emptyset \ne P \subseteq \langle t \rangle.$$

(b) Assume that $\mathcal{J}$ satisfies (5.3.1). Let $J_{t+1}$ be a subset of $\langle n \rangle$ of cardinality $m - 1$. Then $\mathcal{J}' := \mathcal{J} \cup \{J_{t+1}\}$ satisfies the $m$-intersection property if and only if

$$\# \bigcup_{i \in P} (J_i^c \cap J_{t+1}) \ge \#P \quad \text{for all } \emptyset \ne P \subseteq \langle t \rangle.$$

In particular, if $\mathcal{J}'$ satisfies the $m$-intersection property then each $J_i^c \cap J_{t+1}$ is nonempty for $i = 1, \ldots, t$. Hint: Observe that $J_{t+1}^c \cup (\bigcup_{i \in P} J_i^c)$ decomposes to a union of two disjoint sets $J_{t+1}^c$ and $\bigcup_{i \in P} (J_i^c \cap J_{t+1})$.

6. Let $S_1, \ldots, S_t$ be $t$ nonempty subsets of a finite nonempty set $S$ of cardinality $t$ at least. $S_1, \ldots, S_t$ is said to have a set of distinct representatives if there exists a subset $\{s_1, \ldots, s_t\} \subseteq S$ of cardinality $t$ such that $s_i \in S_i$ for $i = 1, \ldots, t$. Show that if $S_1, \ldots, S_t$ has a set of distinct representatives then

$$\# \bigcup_{i \in P} S_i \ge \#P \quad \text{for all } \emptyset \ne P \subseteq \langle t \rangle.$$

Hall's Theorem states that the above conditions are necessary and sufficient for existence of a set of distinct representatives [Hal35].

7. Let the assumptions of Problem 6 hold. Let $G$ be a bipartite graph on a set of vertices $V = \langle t \rangle \cup S$ and the set of edges $E \subseteq \langle t \rangle \times S$ as follows: $(i, s) \in \langle t \rangle \times S$ is an edge if and only if $s \in S_i$. Show that $S_1, \ldots, S_t$ has a set of distinct representatives if and only if $G$ has a match $M \subseteq E$, i.e. no two distinct edges in $M$ have a common vertex, of cardinality $t$.

Remark: There exist effective algorithms in bipartite graphs $G = (V_1 \cup V_2, E)$, $E \subseteq V_1 \times V_2$ to find a match of size $\min(\#V_1, \#V_2)$.
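One standard way to find a set of distinct representatives, or to certify that none exists, is bipartite matching by augmenting paths. The sketch below uses that classical technique; it is offered as an illustration of the remark and is not claimed to be the specific algorithm the text has in mind. The function name is hypothetical.

```python
def distinct_representatives(sets):
    """Return a list reps with reps[i] in sets[i] and all reps distinct, or None.

    Implements bipartite matching by augmenting paths between set indices and
    elements of the ground set.
    """
    owner = {}  # element -> index of the set it currently represents

    def try_assign(i, visited):
        # Try to give set i a representative, re-routing earlier choices if needed.
        for s in sets[i]:
            if s in visited:
                continue
            visited.add(s)
            if s not in owner or try_assign(owner[s], visited):
                owner[s] = i
                return True
        return False

    for i in range(len(sets)):
        if not try_assign(i, set()):
            return None  # Hall's condition fails for some subfamily
    reps = [None] * len(sets)
    for s, i in owner.items():
        reps[i] = s
    return reps


print(distinct_representatives([{1, 2}, {1}, {2, 3}]))
print(distinct_representatives([{1, 2}, {1, 2}, {1, 2}]))
```

In the first example the representatives are forced (the second set must use 1, hence the first uses 2 and the third uses 3); in the second example three sets share only two elements, so Hall's condition fails.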

8. Let $A \in \mathbb{F}^{k \times n}$ be nondegenerate. Show

(a) Let $I \subseteq \langle n \rangle$ be of cardinality $n - k + 1$. Then there exists $\mathbf{x}(I) = (x_1, \ldots, x_n)$ in the row space of $A$, with $\mathrm{supp}(\mathbf{x}(I)) = I$, whose nonzero coordinates are given by $x_j = (-1)^{p_j + 1} \det A[:, I^c \cup \{j\}]$ for any $j \in I$, where $p_j$ is the number of integers in $I^c$ less than $j$.

(b) Let $I$ and $\mathbf{x}(I)$ be defined as in (a). Show that there exists a unique $\mathbf{z}(I) = (z_1, \ldots, z_k) \in \mathbb{F}^k$ such that $\mathbf{x}(I) = \mathbf{z}(I) A$. Use the fact that $(\mathbf{z}A)_j = 0$ for any $j \in I^c$ and Cramer's rule to show that $z_i = (-1)^i \det A[\{i\}^c, I^c]$ for $i = 1, \ldots, k$.

(c) Let $I_1, \ldots, I_s \subseteq \langle n \rangle$ be sets of cardinality $n - k + 1$. Let $\mathbf{x}(I_1), \mathbf{z}(I_1), \ldots, \mathbf{x}(I_s), \mathbf{z}(I_s)$ be defined as above. Then

i. $\mathbf{x}(I_1), \ldots, \mathbf{x}(I_s)$ are linearly independent if and only if $\mathbf{z}(I_1), \ldots, \mathbf{z}(I_s)$ are linearly independent.

ii. Let $D = \mathrm{diag}(-1, 1, -1, \ldots) \in \mathbb{F}^{k \times k}$. Then the matrix $D[\mathbf{z}(I_1)^\top \ \mathbf{z}(I_2)^\top \ \ldots \ \mathbf{z}(I_s)^\top] \in \mathbb{F}^{k \times s}$ is the submatrix $\bigwedge^{k-1} A[:, \{I_1^c, \ldots, I_s^c\}]$. Hence $\mathbf{z}(I_1), \ldots, \mathbf{z}(I_s)$ are linearly independent if and only if the submatrix $\bigwedge^{k-1} A[:, \{I_1^c, \ldots, I_s^c\}]$ has rank $s$.

iii. The submatrix $\bigwedge^{k-1} A[:, \{I_1^c, \ldots, I_s^c\}]$ has rank $s$ if and only if not all the determinants $\det \bigwedge^{k-1} A[\{\{i_1\}^c, \ldots, \{i_s\}^c\}, \{I_1^c, \ldots, I_s^c\}]$ for $1 \le i_1 < i_2 < \cdots < i_s \le k$ are equal to zero.

iv. $\mathbf{x}(I_1), \ldots, \mathbf{x}(I_k)$ is a basis in the row space of $A$ if and only if the determinant of the full row submatrix of $\bigwedge^{k-1} A$ corresponding to the columns determined by $I_1^c, \ldots, I_k^c$ is not equal to zero.

9. Let $A \in \mathbb{F}^{k\times n}$ be nondegenerate. Let $b_1,\ldots,b_{n-k} \in \mathbb{F}^n$ be a basis in
the null space of $A$ and denote by $B^T \in \mathbb{F}^{n\times(n-k)}$ the matrix whose
columns are $b_1,\ldots,b_{n-k}$. Show

(a) Let $J \subset \langle n\rangle$ be of cardinality $k + 1$. Then there exists $y(J) =
(y_1,\ldots,y_n)^T$ in the column space of $B^T$, with supp $(y(J)) = J$,
whose nonzero coordinates are given by $y_j = (-1)^{p_j+1} \det B[:, J^c \cup \{j\}]$
for any $j \in J$, where $p_j$ is the number of integers in
$J^c$ less than $j$.

(b) Let $J$ and $y(J)$ be defined as in (a). Show that there exists
a unique $u(J) = (u_1,\ldots,u_{n-k})^T \in \mathbb{F}^{n-k}$ such that $y(J) =
B^T u(J)$. Use the fact that $(B^T u)_j = 0$ for any $j \in J^c$ and
Cramer's rule to show that $u_i = (-1)^i \det B[\{i\}^c, J^c]$ for $i =
1,\ldots,n-k$.

(c) Let $J_1,\ldots,J_t \subset \langle n\rangle$ be sets of cardinality $k+1$. Let $y(J_1), u(J_1),\ldots,y(J_t), u(J_t)$
be defined as above. Then

i. $y(J_1),\ldots,y(J_t)$ are linearly independent if and only if $u(J_1),\ldots,u(J_t)$
are linearly independent.

ii. Let $D = \mathrm{diag}(-1, 1, -1, \ldots) \in \mathbb{F}^{(n-k)\times(n-k)}$. Then the matrix
$D[u(J_1)\ u(J_2) \ldots u(J_t)] \in \mathbb{F}^{(n-k)\times t}$ is the submatrix $\wedge^{n-k-1}B[\,;,\{J_1^c,\ldots,J_t^c\}]$.
Hence $u(J_1),\ldots,u(J_t)$ are linearly independent if and only
if the submatrix $\wedge^{n-k-1}B[\,;,\{J_1^c,\ldots,J_t^c\}]$ has rank $t$.

iii. The submatrix $\wedge^{n-k-1}B[\,;,\{J_1^c,\ldots,J_t^c\}]$ has rank $t$ if and
only if not all the determinants $\det \wedge^{n-k-1}B[\{\{i_1\}^c,\ldots,\{i_t\}^c\},\{J_1^c,\ldots,J_t^c\}]$
for $1 \le i_1 < i_2 < \cdots < i_t \le n-k$ are equal to zero.

iv. $y(J_1),\ldots,y(J_{n-k})$ is a basis in the null space of $A$ if and
only if the determinant of the full row submatrix of $\wedge^{n-k-1}B$
corresponding to the columns determined by $J_1^c,\ldots,J_{n-k}^c$
is not equal to zero.

10. Let $C \in \mathbb{F}^{n\times(n-k)}$ be a matrix of rank $n - k$. Show that there exists
$A \in \mathbb{F}^{k\times n}$ of rank $k$ such that $AC = 0$.

11. Let $\mathbb{F}$ be a field of characteristic 0. Let $p(x_1,\ldots,x_n) \in \mathbb{F}[x_1,\ldots,x_n]$.
Show that $p(x_1,\ldots,x_n) = 0$ for all $x = (x_1,\ldots,x_n)^T \in \mathbb{F}^n$ if and
only if $p$ is the zero polynomial. Hint: Use induction.

12. Let $\mathbb{F}$ be a field of characteristic 0. Assume that $V = \mathbb{F}^n$. Identify
$V'$ with $\mathbb{F}^n$. So for $u \in V$, $f \in V'$, $\langle u, f\rangle = f^T u$. Show

(a) $U \subset V$ is a subspace of dimension $n - 1$ if and only if there
exists a nontrivial linear polynomial $l(x) = a_1x_1 + \ldots + a_nx_n$
such that $U$ is the zero set of $l(x)$, i.e. $U = Z(l)$.

(b) Let $U_1,\ldots,U_k$ be $k$ subspaces of $V$ of dimension $n - 1$. Show
that there exists a nontrivial polynomial $p = \prod_{i=1}^k l_i \in \mathbb{F}[x_1,\ldots,x_n]$,
where each $l_i$ is a nonzero linear polynomial, such that $\cup_{i=1}^k U_i$
is $Z(p)$.

(c) Show that if $U_1,\ldots,U_k$ are $k$ strict subspaces of $V$ then $\cup_{i=1}^k U_i$
is a strict subset of $V$. Hint: One can assume that $\dim U_i =
n - 1$, $i = 1,\ldots,k$ and then use Problem 11.

(d) Let $U, U_1,\ldots,U_k$ be subspaces of $V$. Assume that $U \subset \cup_{i=1}^k U_i$.
Show that there exists a subspace $U_i$ which contains $U$. Hint:
Observe $U = \cup_{i=1}^k (U_i \cap U)$.

13. Let the assumptions of Problem 12 hold. Let $X = [x_{ij}]$, $U = [u_{ij}] \in
\mathbb{F}^{n\times l}$. View the matrices $\wedge^l X, \wedge^l U$ as column vectors in $\mathbb{F}^{\binom{n}{l}}$. Let
$p_U(x_{11},\ldots,x_{nl}) := \det (X^T U) = (\wedge^l X)^T \wedge^l U$. View $p_U$ as a polynomial
in $nl$ variables with coefficients in $\mathbb{F}$. Show

(a) $p_U$ is a homogeneous multilinear polynomial of degree $l$.

(b) $p_U$ is a trivial polynomial if and only if rank $U < l$.

(c) Let $\mathbf{X} \in \mathrm{Gr}_l(V)$, $\mathbf{U} \in \mathrm{Gr}_l(V')$ and assume that the column spaces
of $X = [x_{ij}] = [x_1,\ldots,x_l]$, $U = [u_{ij}] = [u_1,\ldots,u_l] \in \mathbb{F}^{n\times l}$ are
$\mathbf{X}, \mathbf{U}$ respectively. Then

$p_U(x_{11},\ldots,x_{nl}) = \det [\langle x_j, u_i\rangle]_{i,j=1}^l$, $\langle x_1 \wedge \ldots \wedge x_l, u_1 \wedge \ldots \wedge u_l\rangle = l!\, p_U(x_{11},\ldots,x_{nl})$.

In particular, $\dim \mathbf{X} \cap \mathbf{U}^{\perp} = 0 \iff p_U(x_{11},\ldots,x_{nl}) \ne 0$.

14. Let $\mathbb{F}$ be a field of characteristic 0. Assume that $V$ is an $n$-dimensional
vector space with $n \ge 2$. Let $X \subset V$, $U \subset V'$ be $m \ge 2$ dimensional
subspaces. Assume that $X \not\subset U^{\perp}$. Let $x_m \in X \setminus U^{\perp}$. Let $\hat U =
\{x_m\}^{\perp} \cap U$. Let $\hat X$ be any $m - 1$ dimensional subspace of $X$ which
does not contain $x_m$. Show

(a) $\dim \hat U = m - 1$, $X = \hat X \oplus \mathrm{span}(x_m)$.

(b) There exists $u_m \in U$ such that $\langle x_m, u_m\rangle = 1$. Furthermore
$U = \hat U \oplus \mathrm{span}(u_m)$.

(c) Let $\{x_1,\ldots,x_{m-1}\}$, $\{u_1,\ldots,u_{m-1}\}$ be bases of $\hat X, \hat U$ respectively.
Then

$\langle x_1 \wedge \ldots \wedge x_m, u_1 \wedge \ldots \wedge u_m\rangle = m \langle x_1 \wedge \ldots \wedge x_{m-1}, u_1 \wedge \ldots \wedge u_{m-1}\rangle$,

where $u_m$ is defined in (b). Hint: Use (5.2.8) and expand the
determinant by the last row.

(d) Assume that $\wedge^{m-1} \hat X = \mathrm{span}(y)$, $\wedge^{m-1} \hat U = \mathrm{span}(w)$. Then
$\langle y \wedge x_m, w \wedge u_m\rangle = m \langle y, w\rangle$.

15. Let the assumptions of Theorem 5.3.10 hold. View $V$ and $V'$ as $\mathbb{F}^n$.
So for $u \in V$, $f \in V'$, $\langle u, f\rangle = f^T u$. Let $\mathbf{X} \in \mathrm{Gr}_m(V)$ be the column
space of $X = [x_{ij}] \in \mathbb{F}^{n\times m}$. Show that there exists a homogeneous
polynomial $p_{U_1,\ldots,U_m}(x_{11},\ldots,x_{nm})$ of degree $m(m - 1)$ such that
$\mathbf{X} \in W_m(U_1,\ldots,U_m)$ if and only if

Hint: Choose a basis of $\mathbf{X}$ to be the columns of $X$. Then use the
first paragraph of Proof of Theorem 5.3.10 and Problem 13.

16. Let $\mathbb{F}$ be a field and $V$ an $n$-dimensional vector space over $\mathbb{F}$. Let
$v_1,\ldots,v_n$ be a basis of $V$. For $\emptyset \ne K \subset \langle n\rangle$ let $U_K = \oplus_{i\in K} \mathrm{span}(v_i)$.
Let $t \le m$ and assume that $K_1,\ldots,K_t \subset \langle n\rangle$ are sets of cardinality
$m - 1$ for any $2 \le m \in \langle n\rangle$. Show that $U_{K_1},\ldots,U_{K_t}$ satisfy the
$m$-dimensional intersection property if and only if $K_1,\ldots,K_t$ satisfy
the $m$-intersection property.

17. Let $V$ be an $n$-dimensional vector space over $\mathbb{F}$ of characteristic 0.
Show

(a) Let $2 \le m \le n$. If $U_1,\ldots,U_m$ are $m - 1$ dimensional vector
spaces satisfying the $m$-dimensional intersection property then
$\dim \sum_{i=1}^m \wedge^{m-1} U_i = m$.

(b) For $m = 3$, $n = 4$ there exist $U_1, U_2, U_3$, which do not satisfy the
3-dimensional intersection property such that $\dim \sum_{i=1}^3 \wedge^2 U_i =
3$. Hint: Choose a basis in $V$ and assume that each $U_i$ is
spanned by some two vectors in the basis.

(c) Show that for $2 \le t \le m = n$, $U_1,\ldots,U_t$ satisfy the $n$-intersection
property if and only if $\dim \sum_{i=1}^t \wedge^{n-1} U_i = t$.

5.4 Tensor products of inner product spaces 

Let $\mathbb{F} = \mathbb{R}, \mathbb{C}$ and assume that $V_i$ is an $n_i$-dimensional vector space with the
inner product $\langle\cdot,\cdot\rangle_i$ for $i = 1,\ldots,k$. Then $Y := \otimes_{i=1}^k V_i$ has a unique inner
product $\langle\cdot,\cdot\rangle$ satisfying the property

(5.4.1) $\langle \otimes_{i=1}^k x_i, \otimes_{i=1}^k y_i\rangle = \prod_{i=1}^k \langle x_i, y_i\rangle_i$, for all $x_i, y_i \in V_i$, $i = 1,\ldots,k$.

(See Problem 1.) We will assume that $Y$ has the above canonical inner
product, unless stated otherwise.
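Property (5.4.1) can be checked in coordinates. The sketch below assumes each $V_i$ is $\mathbb{C}^{n_i}$ with the standard inner product and that a decomposable tensor is represented by its Kronecker coordinate vector; the helper names `kron`, `inner` and `tensor` are illustrative only.

```python
from functools import reduce


def kron(x, y):
    """Coordinates of the tensor product of x and y in the product basis."""
    return [a * b for a in x for b in y]


def inner(x, y):
    """Standard inner product on C^n: sum of x_i times conj(y_i)."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))


def tensor(vectors):
    """Coordinates of x_1 (x) ... (x) x_k."""
    return reduce(kron, vectors)


xs = [[1, 2j], [3, -1, 2], [1 + 1j, 4]]
ys = [[2, 1], [0, 5, 1j], [1, -2j]]

# Left-hand side of (5.4.1): inner product of the two decomposable tensors.
lhs = inner(tensor(xs), tensor(ys))

# Right-hand side of (5.4.1): product of the factor-wise inner products.
rhs = 1
for x, y in zip(xs, ys):
    rhs *= inner(x, y)
```

Under these assumptions `lhs` and `rhs` agree, which is the coordinate form of (5.4.1) for decomposable tensors.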

Proposition 5.4.1 Let $U_i, V_i$ be finite dimensional IPS over $\mathbb{F} :=
\mathbb{R}, \mathbb{C}$ with the inner products $\langle\cdot,\cdot\rangle_{U_i}$, $\langle\cdot,\cdot\rangle_{V_i}$ for $i = 1,\ldots,k$ respectively.
Let $X := \otimes_{i=1}^k U_i$, $Y := \otimes_{i=1}^k V_i$ be IPS with the canonical inner products
$\langle\cdot,\cdot\rangle_X$, $\langle\cdot,\cdot\rangle_Y$ respectively. Then the following claims hold.

1. Assume that $T_i \in L(V_i, U_i)$ for $i = 1,\ldots,k$. Then $\otimes_{i=1}^k T_i \in L(Y, X)$
and $(\otimes_{i=1}^k T_i)^* = \otimes_{i=1}^k T_i^* \in L(X, Y)$.

2. Assume that $T_i \in L(V_i)$ is normal for $i = 1,\ldots,k$. Then $\otimes_{i=1}^k T_i \in
L(Y)$ is normal. Moreover, $\otimes_{i=1}^k T_i$ is hermitian or unitary, if each
$T_i$ is hermitian or unitary respectively.

3. Assume that $T_i \in L(V_i, U_i)$ for $i = 1,\ldots,k$. Let $\sigma_1(T_i) \ge \ldots \ge
\sigma_{\mathrm{rank}\, T_i}(T_i) > 0$, $\sigma_j(T_i) = 0$, $j > \mathrm{rank}\, T_i$ be the singular values of $T_i$.
Let $c_{1,i},\ldots,c_{n_i,i}$ and $d_{1,i},\ldots,d_{m_i,i}$ be orthonormal bases of $V_i$ and
$U_i$ consisting of right and left singular vectors of $T_i$ as described in
(4.9.5):

$T_i c_{j_i,i} = \sigma_{j_i}(T_i) d_{j_i,i}$, $j_i = 1,\ldots,$ $i = 1,\ldots,k$.

Then

$(\otimes_{i=1}^k T_i) \otimes_{i=1}^k c_{j_i,i} = \left(\prod_{i=1}^k \sigma_{j_i}(T_i)\right) \otimes_{i=1}^k d_{j_i,i}$.

In particular

(5.4.2) $\|\otimes_{i=1}^k T_i\| = \sigma_1(\otimes_{i=1}^k T_i) = \prod_{i=1}^k \|T_i\| = \prod_{i=1}^k \sigma_1(T_i)$,

$\sigma_{\prod_{i=1}^k \mathrm{rank}\, T_i}(\otimes_{i=1}^k T_i) = \prod_{i=1}^k \sigma_{\mathrm{rank}\, T_i}(T_i)$.

We consider a fixed IPS vector space $V$ of dimension $n$ and its exterior
products $\wedge^k V$ for $k = 1,\ldots,n$. Since $\wedge^k V$ is a subspace of $Y :=
\otimes_{i=1}^k V_i$, $V_1 = \ldots = V_k = V$, it follows that $\wedge^k V$ has a canonical inner
product induced by $\langle\cdot,\cdot\rangle_Y$. See Problem 3a.

Proposition 5.4.2 Let $V, U$ be IPS of dimension $n$ and $m$ respectively.
Assume that $T \in L(V, U)$. Suppose that $c_1,\ldots,c_n$ and $d_1,\ldots,d_m$ are
orthonormal bases of $V$ and $U$ composed of the right and left singular
eigenvectors of $T$ respectively, as given in (4.9.5). Let $k \in \mathbb{N} \cap [1, \min(m, n)]$.
Then the orthonormal bases

$\frac{1}{\sqrt{k!}} c_{i_1} \wedge \ldots \wedge c_{i_k} \in \wedge^k V$, $1 \le i_1 < \ldots < i_k \le n$,

$\frac{1}{\sqrt{k!}} d_{j_1} \wedge \ldots \wedge d_{j_k} \in \wedge^k U$, $1 \le j_1 < \ldots < j_k \le m$,

are the right and the left singular vectors of $\wedge^k T \in L(\wedge^k V, \wedge^k U)$, with
the corresponding singular values $\prod_{l=1}^k \sigma_{i_l}(T)$ and $\prod_{l=1}^k \sigma_{j_l}(T)$ respectively.
In particular

$\wedge^k T\, c_1 \wedge \ldots \wedge c_k = \|\wedge^k T\|\, d_1 \wedge \ldots \wedge d_k$, $\|\wedge^k T\| = \sigma_1(\wedge^k T) = \prod_{i=1}^k \sigma_i(T)$,

$\wedge^k T\, c_{\mathrm{rank}\, T-k+1} \wedge \ldots \wedge c_{\mathrm{rank}\, T} = \prod_{i=1}^k \sigma_{\mathrm{rank}\, T-k+i}(T)\, d_{\mathrm{rank}\, T-k+1} \wedge \ldots \wedge d_{\mathrm{rank}\, T}$

are the biggest and the smallest positive singular value of $\wedge^k T$ for $k \le
\mathrm{rank}\, T$.
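In coordinates, and assuming the usual identification of the matrix of $\wedge^k T$ in the normalized wedge bases above with the $k$-th compound matrix (the matrix of all $k \times k$ minors), Proposition 5.4.2 can be explored on small integer examples. The sketch below uses hypothetical helpers `det`, `compound` and `matmul`; it checks the Cauchy-Binet multiplicativity $\wedge^k(PT) = \wedge^k P \wedge^k T$ and shows that the compound of a diagonal matrix is diagonal with products of $k$ diagonal entries.

```python
from itertools import combinations


def det(m):
    """Determinant by Laplace expansion along the first row (exact for integers)."""
    if not m:
        return 1
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def compound(a, k):
    """k-th compound matrix: all k x k minors, index sets in lexicographic order."""
    rows = list(combinations(range(len(a)), k))
    cols = list(combinations(range(len(a[0])), k))
    return [[det([[a[i][j] for j in c] for i in r]) for c in cols] for r in rows]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


P = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
T = [[2, 0, 1], [1, 1, 0], [0, 3, 1]]

# Cauchy-Binet: the compound of a product is the product of the compounds.
lhs = compound(matmul(P, T), 2)
rhs = matmul(compound(P, 2), compound(T, 2))

# Diagonal case: entries of the compound are products of k diagonal entries.
S = [[5, 0, 0], [0, 3, 0], [0, 0, 2]]
diag_compound = compound(S, 2)
```

For the diagonal matrix `S` the second compound is the diagonal matrix with entries 15, 10, 6, which mirrors the statement that the singular values of $\wedge^k T$ are products of $k$ singular values of $T$.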

Corollary 5.4.3 Suppose that $V$ is an IPS of dimension $n$. Assume
that $T \in S_+(V)$. Let $\lambda_1(T) \ge \ldots \ge \lambda_n(T) \ge 0$ be the eigenvalues of
$T$ with the corresponding orthonormal eigenbasis $c_1,\ldots,c_n$ of $V$. Then
$\wedge^k T \in S_+(\wedge^k V)$. Let $k \in \mathbb{N} \cap [1, n]$. Then the orthonormal basis
$\frac{1}{\sqrt{k!}} c_{i_1} \wedge \ldots \wedge c_{i_k}$, $1 \le i_1 < \ldots < i_k \le n$ of $\wedge^k V$ is an eigensystem of $\wedge^k T$,
with the corresponding eigenvalues $\prod_{l=1}^k \lambda_{i_l}(T)$. In particular

$\wedge^k T\, c_1 \wedge \ldots \wedge c_k = \|\wedge^k T\|\, c_1 \wedge \ldots \wedge c_k$, $\|\wedge^k T\| = \lambda_1(\wedge^k T) = \prod_{i=1}^k \lambda_i(T)$,

$\wedge^k T\, c_{\mathrm{rank}\, T-k+1} \wedge \ldots \wedge c_{\mathrm{rank}\, T} = \prod_{i=1}^k \lambda_{\mathrm{rank}\, T-k+i}(T)\, c_{\mathrm{rank}\, T-k+1} \wedge \ldots \wedge c_{\mathrm{rank}\, T}$

are the biggest and the smallest positive eigenvalue of $\wedge^k T$ for $k \le \mathrm{rank}\, T$.
See Problem 4.

Theorem 5.4.4 Let $U, V, W$ be finite dimensional IPS. Assume that
$P \in L(U, W)$, $T \in L(V, U)$. Then

(5.4.3) $\prod_{i=1}^k \sigma_i(PT) \le \prod_{i=1}^k \sigma_i(P)\sigma_i(T)$, $k = 1,\ldots$

For $k \le \min(\mathrm{rank}\, P, \mathrm{rank}\, T)$ equality in (5.4.3) holds if and only if the
following condition is satisfied. There exists a $k$-dimensional subspace $V_k$
of $V$ which is spanned by the first $k$ orthonormal right singular vectors of $T$,
such that $TV_k$ is a $k$-dimensional subspace of $U$ which is spanned by the first
$k$ orthonormal right singular vectors of $P$.

Proof. Suppose first that $k = 1$. Then $\|PT\| = \|PTv\|$, where
$v \in V$, $\|v\| = 1$ is the right singular vector of $PT$. Clearly, $\|PTv\| =
\|P(Tv)\| \le \|P\|\,\|Tv\| \le \|P\|\,\|T\|$, which implies the inequality (5.4.3)
for $k = 1$. Assume that $\|P\|\,\|T\| > 0$. For the equality $\|PT\| = \|P\|\,\|T\|$
we must have that $Tv$ is a right singular vector of $P$ corresponding to $\sigma_1(P)$ and
$v$ is a right singular vector of $T$ corresponding to $\sigma_1(T)$. This shows the equality
case in the theorem for $k = 1$.

Assume that $k > 1$. If the right-hand side of (5.4.3) is zero then
$\mathrm{rank}\, PT \le \min(\mathrm{rank}\, P, \mathrm{rank}\, T) < k$ and $\sigma_k(PT) = 0$. Hence (5.4.3) trivially
holds. Assume that $k \le \min(\mathrm{rank}\, P, \mathrm{rank}\, T)$. Then the right-hand
side of (5.4.3) is positive. Clearly $\min(\mathrm{rank}\, P, \mathrm{rank}\, T) \le \min(\dim U, \dim V, \dim W)$.
Observe that $\wedge^k T \in L(\wedge^k V, \wedge^k U)$, $\wedge^k P \in L(\wedge^k U, \wedge^k W)$. Hence (5.4.3)
for $k = 1$ applied to $\wedge^k PT = \wedge^k P \wedge^k T$ yields $\sigma_1(\wedge^k PT) \le \sigma_1(\wedge^k P)\sigma_1(\wedge^k T)$.
Use Proposition 5.4.2 to deduce (5.4.3). In order to have $\sigma_1(\wedge^k PT) =
\sigma_1(\wedge^k P)\sigma_1(\wedge^k T)$ the operator $\wedge^k T$ has a right first singular vector $x \in
\wedge^k V$, such that $\frac{1}{\|\wedge^k T x\|} \wedge^k T x$ is a right singular vector of $\wedge^k P$ corresponding
to $\sigma_1(\wedge^k P)$. It is left to show that $x$ can be chosen as $c_1 \wedge \ldots \wedge c_k$,
where $c_1,\ldots,c_k$ are the right singular vectors of $T$ corresponding to the
first $k$ singular values of $T$.

Suppose that

(5.4.4) $\sigma_1(T) = \ldots = \sigma_{l_1}(T) > \sigma_{l_1+1}(T) = \ldots = \sigma_{l_2}(T) > \ldots > 0 = \sigma_j(T)$ for $j > l_p$.

Assume first that $k = l_i$ for some $i \le p$. Then $\sigma_1(\wedge^k T) > \sigma_2(\wedge^k T)$ and
$c_1 \wedge \ldots \wedge c_k$ is the right singular vector of $\wedge^k T$ corresponding to $\sigma_1(\wedge^k T)$.
Then $\sigma_1(\wedge^k P \wedge^k T) = \sigma_1(\wedge^k P)\sigma_1(\wedge^k T)$ if and only if $(\wedge^k T) c_1 \wedge \ldots \wedge c_k =
Tc_1 \wedge \ldots \wedge Tc_k$ is the right singular vector of $\wedge^k P$ corresponding to $\sigma_1(\wedge^k P)$.
Assume that

$\sigma_1(P) = \ldots = \sigma_{m_1}(P) > \sigma_{m_1+1}(P) = \ldots = \sigma_{m_2}(P) > \ldots > 0 = \sigma_j(P)$ for $j > m_q$.

Suppose that $k = m_{j-1} + r$, where $1 \le r \le m_j - m_{j-1}$. (We assume here
that $m_0 = 0$.) Let $U_1$ be the subspace spanned by the $m_{j-1}$ right singular
vectors of $P$ corresponding to the first $m_{j-1}$ singular values of $P$ and $W_1$
be the subspace spanned by $m_j - m_{j-1}$ right singular vectors of $P$ corresponding
to $\sigma_{m_{j-1}+1}(P),\ldots,\sigma_{m_j}(P)$. Then any right singular vector of
$\wedge^k P$ corresponding to $\sigma_1(\wedge^k P)$ is in the subspace $(\wedge^{m_{j-1}} U_1) \wedge (\wedge^r W_1)$.
Let $V_k = \mathrm{span}(c_1,\ldots,c_k)$. So $c_1 \wedge \ldots \wedge c_k$ is a nonzero vector in $\wedge^k V_k$
and $(\wedge^k T) c_1 \wedge \ldots \wedge c_k$ is a nonzero vector in $\wedge^k W_2$, where $W_2 := TV_k$ and
$U_2 = \{0\}$. The equality in (5.4.3) yields that $(\wedge^{m_{j-1}} U_1) \wedge (\wedge^r W_1) \cap
(\wedge^0 U_2) \wedge (\wedge^k W_2) \ne \{0\}$. Lemma 5.2.11 yields that $U_1 \subset TV_k$ and
$TV_k \subseteq U_1 + W_1$. So $TV_k$ is spanned by the first $k$ right singular vectors
of $P$.

Assume now that $k = l_{i-1} + s$, $1 \le s < l_i - l_{i-1}$. Then the subspace
spanned by all right singular vectors of $\wedge^k T$ corresponding to $\sigma_1(\wedge^k T)$ is
equal to $(\wedge^{l_{i-1}} U_3) \wedge (\wedge^s W_3)$, where $U_3$ and $W_3$ are the subspaces spanned
by the right singular vectors of $T$ corresponding to the first $l_{i-1}$ and the
next $l_i - l_{i-1}$ singular values of $T$ respectively. Let $U_2 := TU_3$, $W_2 :=
TW_3$. The equality in (5.4.3) yields that $(\wedge^{m_{j-1}} U_1) \wedge (\wedge^r W_1) \cap
(\wedge^{l_{i-1}} U_2) \wedge (\wedge^s W_2)$ contains a right singular vector of $\wedge^k P$ corresponding
to $\sigma_1(\wedge^k P)$. Lemma 5.2.11 yields that there exists a $k$ dimensional
subspace $V_k'$ such that $V_k' \supseteq U_1 + U_2$ and $V_k' \subseteq (U_1 + W_1) \cap (U_2 + W_2)$.
Hence there exists a $k$-dimensional subspace $V_k$ of $U_3 + W_3$ containing
$U_3$ such that $V_k' = TV_k$ contains $U_1$ and is contained in $U_1 + W_1$. Hence
$TV_k$ is spanned by the first $k$ right singular vectors of $P$.

Assume now that $\prod_{i=1}^l \sigma_i(P)\sigma_i(T) > 0$. Then $0 < \sigma_i(P)$, $0 < \sigma_i(T)$
for $i = 1,\ldots,l$. Assume that for $k = 1,\ldots,l$ equality holds in (5.4.3). We
prove the existence of orthonormal sets $c_1,\ldots,c_l$, $d_1,\ldots,d_l$ of right singular
vectors of $T$ and $P$ respectively such that $\frac{1}{\sigma_k(T)} Tc_k = d_k$, $k = 1,\ldots,l$
by induction on $l$. For $l = 1$ the result is trivial. Assume that the result
holds for $l = m$ and let $l = m + 1$. The equality in (5.4.3) for $k = m + 1$
yields the existence of an $m + 1$ dimensional subspace $X \subseteq V$ such that $X$ is
spanned by the first $m + 1$ right singular vectors of $T$ and $TX$ is spanned
by the first $m + 1$ right singular vectors of $P$. □
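Inequality (5.4.3) can be sanity-checked for real $2 \times 2$ matrices without any linear algebra library, because $\sigma_1^2 + \sigma_2^2$ is the squared Frobenius norm and $\sigma_1\sigma_2 = |\det|$. The following is only a numerical sketch under that restriction; the matrices `P` and `T` are arbitrary illustrative choices.

```python
import math


def matmul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def singular_values_2x2(a):
    """Singular values s1 >= s2 of a real 2 x 2 matrix.

    Uses s1^2 + s2^2 = squared Frobenius norm and s1 * s2 = |det a|.
    """
    fro2 = sum(x * x for row in a for x in row)
    d = abs(a[0][0] * a[1][1] - a[0][1] * a[1][0])
    disc = math.sqrt(max(fro2 * fro2 - 4 * d * d, 0.0))
    s1 = math.sqrt((fro2 + disc) / 2)
    s2 = math.sqrt(max((fro2 - disc) / 2, 0.0))
    return s1, s2


P = [[1.0, 2.0], [0.0, 1.0]]
T = [[0.0, 1.0], [3.0, 1.0]]

sp, st, spt = (singular_values_2x2(m) for m in (P, T, matmul2(P, T)))

# k = 1: sigma_1(PT) <= sigma_1(P) * sigma_1(T), so the gap is nonnegative.
k1_gap = sp[0] * st[0] - spt[0]

# k = 2 = n: both sides equal |det P| * |det T|, so the gap vanishes up to rounding.
k2_gap = sp[0] * sp[1] * st[0] * st[1] - spt[0] * spt[1]
```

For square matrices the case $k = n$ is always an equality, since both sides reduce to $|\det P|\,|\det T|$; the interesting content of the theorem lies in the intermediate values of $k$.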



Theorem 5.4.5 Let the assumptions of Theorem 5.4.4 hold. Then
equalities in (5.4.3) hold for $k = 1,\ldots,l \le \min(\mathrm{rank}\, P, \mathrm{rank}\, T)$ if and
only if there exist first $l$ orthonormal right singular vectors $c_1,\ldots,c_l$ of
$T$, such that $\frac{1}{\sigma_1(T)} Tc_1,\ldots,\frac{1}{\sigma_l(T)} Tc_l$ are first $l$ orthonormal right singular
vectors of $P$.

Proof. We prove the theorem by induction on $l$. For $l = 1$ the
theorem follows from Theorem 5.4.4. Suppose that the theorem holds for
$l = j$. Let $l = j + 1$. Since we assumed that equality holds in (5.4.3)
for $k = l$, Theorem 5.4.4 yields that there exists an $l$-dimensional
subspace $V_l$ of $V$ which is spanned by the first $l$ right singular vectors
of $T$, and $TV_l$ is spanned by the first $l$ right singular vectors of $P$. Let
$\hat T \in L(V_l, TV_l)$, $\hat P \in L(TV_l, PTV_l)$ be the restrictions of $T$ and $P$ to the






subspaces $V_l$, $TV_l$ respectively. Clearly

(5.4.5) $\sigma_i(\hat T) = \sigma_i(T) > 0$, $\sigma_i(\hat P) = \sigma_i(P) > 0$, for $i = 1,\ldots,l$.

The equalities in (5.4.3) for $k = 1,\ldots,l$ imply that $\sigma_i(PT) = \sigma_i(P)\sigma_i(T)$
for $i = 1,\ldots,l$. Let $\hat Q := \hat P\hat T \in L(V_l, PTV_l)$. Clearly $\hat Q$ is the restriction
of $Q := PT$ to $V_l$. Corollary 4.10.3 yields that $\sigma_i(\hat Q) \le \sigma_i(Q)$ for
$i = 1,\ldots,l$. Since $\det \hat Q = \det \hat P \det \hat T$ we deduce that $\prod_{i=1}^l \sigma_i(\hat Q) =
\prod_{i=1}^l \sigma_i(\hat P) \prod_{i=1}^l \sigma_i(\hat T)$. The above arguments show that $\prod_{i=1}^l \sigma_i(\hat Q) =
\prod_{i=1}^l \sigma_i(Q) > 0$. Corollary 4.10.3 yields that $\sigma_i(\hat Q) = \sigma_i(Q)$. Hence we
have equalities $\prod_{i=1}^k \sigma_i(\hat P\hat T) = \prod_{i=1}^k \sigma_i(\hat P)\sigma_i(\hat T)$ for $k = 1,\ldots,l$. The
induction hypothesis yields that there exist first $l - 1$ orthonormal right
singular vectors of $\hat T$, $c_1,\ldots,c_{l-1}$, such that $\frac{1}{\sigma_1(\hat T)} \hat Tc_1,\ldots,\frac{1}{\sigma_{l-1}(\hat T)} \hat Tc_{l-1}$ are
first $l - 1$ orthonormal right singular vectors of $\hat P$. Complete $c_1,\ldots,c_{l-1}$ to an
orthonormal basis $c_1,\ldots,c_l$ of $V_l$. Then $c_l$ is a right singular vector of $\hat T$
corresponding to $\sigma_l(\hat T)$. Since $\hat Tc_l$ is orthogonal to $\hat Tc_1,\ldots,\hat Tc_{l-1}$, which are
right singular vectors of $\hat P$, it follows that $\frac{1}{\sigma_l(\hat T)} \hat Tc_l$ is a right singular vector
of $\hat P$ corresponding to $\sigma_l(\hat P)$. Use (5.4.5) and the fact that $\hat T$ and $\hat P$ are the
corresponding restrictions of $T$ and $P$ respectively to deduce the theorem.

□

In what follows we need to consider the half closed infinite interval
$[-\infty, \infty)$. We assume that

$-\infty < a$, $a - \infty = -\infty + a = -\infty - \infty = -\infty$ for any $a \in [-\infty, \infty)$.

Denote by $[-\infty, \infty)^n_{\searrow} \subset [-\infty, \infty)^n$ the set of $x = (x_1,\ldots,x_n)$ where $x_1 \ge
\ldots \ge x_n \ge -\infty$.

We now extend the notions of majorization, Schur set, Schur order
preserving function to subsets of $[-\infty, \infty)^n_{\searrow}$. Let $x = (x_1,\ldots,x_n)$, $y =
(y_1,\ldots,y_n) \in [-\infty, \infty)^n_{\searrow}$. Then $x \preceq y$, i.e. $x$ is weakly majorized by $y$,
if the inequalities $\sum_{i=1}^k x_i \le \sum_{i=1}^k y_i$ hold for $k = 1,\ldots,n$. $x \prec y$, i.e. $x$ is
majorized by $y$, if $x \preceq y$ and $\sum_{i=1}^n x_i = \sum_{i=1}^n y_i$. A set $D \subseteq [-\infty, \infty)^n_{\searrow}$ is
called a Schur set if for any $y \in D$ and any $x \prec y$, $x \in D$.

Let $I \subseteq [-\infty, \infty)$ be an interval, which may be open, closed or half closed.
Denote by $I_o$ the interior of $I$. $f : I \to \mathbb{R}$ is called continuous if $f|I_o$ is
continuous, and if $a \in [-\infty, \infty)$ is an end point of $I$ then $f$ is continuous
at $a$ from the left or right respectively. Suppose that $-\infty \in I$. Then
$f : I \to \mathbb{R}$ is called convex on $I$ if $f$ is continuous on $I$ and a nondecreasing
convex function on $I_o$. (See Problem 6.) $f$ is called strictly convex on $I$ if $f$ is
continuous on $I$ and strictly convex on $I_o$. If $-\infty \in I$ then $f$ is continuous
on $I$, and is an increasing strictly convex function on $I_o$. Note that the
function $e^x$ is a strictly convex function on $[-\infty, \infty)$.

Let $D \subseteq [-\infty, \infty)^n$. Then $f : D \to \mathbb{R}$ is continuous, if for any $x \in D$
and any sequence of points $x_k \in D$, $k \in \mathbb{N}$, the equality $\lim_{k\to\infty} f(x_k) =
f(x)$ holds if $\lim_{k\to\infty} x_k = x$. $D$ is convex if for any $x, y \in D$ the point
$tx + (1 - t)y \in D$ for any $t \in (0, 1)$. For a convex $D$, $f : D \to \mathbb{R}$ is
convex if $f$ is continuous and $f(tx + (1 - t)y) \le tf(x) + (1 - t)f(y)$ for any
$t \in (0, 1)$. For a Schur set $D \subseteq [-\infty, \infty)^n_{\searrow}$, $f : D \to \mathbb{R}$ is called Schur order
preserving, strict Schur order preserving, strong Schur order preserving,
strict strong Schur order preserving if $f$ is a continuous function satisfying
the properties described in the beginning of §4.7. It is straightforward to
generalize the results on Schur order preserving functions established in
§4.7 using Problem 7.

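Let the assumptions of Theorem 5.4.4 hold. For any $k \in \mathbb{N}$ let

(5.4.6) $\boldsymbol{\sigma}_k(T) := (\sigma_1(T),\ldots,\sigma_k(T)) \in \mathbb{R}^k_{+,\searrow}$, $\log \boldsymbol{\sigma}_k(T) := (\log \sigma_1(T),\ldots,\log \sigma_k(T)) \in [-\infty, \infty)^k_{\searrow}$.

Theorem 5.4.4 yields

(5.4.7) $\log \boldsymbol{\sigma}_k(PT) \preceq \log \boldsymbol{\sigma}_k(P) + \log \boldsymbol{\sigma}_k(T)$ for any $k \in [1, \max(\mathrm{rank}\, P, \mathrm{rank}\, T)]$,
$\log \boldsymbol{\sigma}_k(PT) \prec \log \boldsymbol{\sigma}_k(P) + \log \boldsymbol{\sigma}_k(T)$ for $k > \max(\mathrm{rank}\, P, \mathrm{rank}\, T)$,
$\log \boldsymbol{\sigma}_k(PT) \prec \log \boldsymbol{\sigma}_k(P) + \log \boldsymbol{\sigma}_k(T)$ if $k = \mathrm{rank}\, P = \mathrm{rank}\, T = \mathrm{rank}\, PT$.

See Problem 8.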

Theorem 5.4.6 Let $U, V, W$ be IPS. Assume that $T \in L(V, U)$, $P \in
L(U, W)$ and $l \in \mathbb{N}$.

1. Assume that $D \subseteq [-\infty, \infty)^l_{\searrow}$ is a strong Schur set containing $\log \boldsymbol{\sigma}_l(PT)$, $\log \boldsymbol{\sigma}_l(P) +
\log \boldsymbol{\sigma}_l(T)$. Let $h : D \to \mathbb{R}$ be a strong Schur order preserving function.
Then $h(\log \boldsymbol{\sigma}_l(PT)) \le h(\log \boldsymbol{\sigma}_l(P) + \log \boldsymbol{\sigma}_l(T))$. Suppose
furthermore that $h$ is strict strong Schur order preserving. Then
equality holds in the above inequality if and only if equality holds in
(5.4.3) for $k = 1,\ldots,l$.

2. Assume that $\log \boldsymbol{\sigma}_l(PT) \prec \log \boldsymbol{\sigma}_l(P) + \log \boldsymbol{\sigma}_l(T)$, and $D \subseteq [-\infty, \infty)^l_{\searrow}$
is a Schur set containing $\log \boldsymbol{\sigma}_l(PT)$, $\log \boldsymbol{\sigma}_l(P) + \log \boldsymbol{\sigma}_l(T)$. Let $h :
D \to \mathbb{R}$ be a Schur order preserving function. Then $h(\log \boldsymbol{\sigma}_l(PT)) \le
h(\log \boldsymbol{\sigma}_l(P) + \log \boldsymbol{\sigma}_l(T))$. Suppose furthermore that $h$ is strict
Schur order preserving. Then equality holds in the above inequality if
and only if equality holds in (5.4.3) for $k = 1,\ldots,l - 1$.

See Problem 9.

Corollary 5.4.7 Let $U, V, W$ be IPS. Assume that $T \in L(V, U)$, $P \in
L(U, W)$ and $l \in \mathbb{N}$. Assume that $\log \boldsymbol{\sigma}_l(PT) \preceq \log \boldsymbol{\sigma}_l(P) + \log \boldsymbol{\sigma}_l(T)$, and
$I \subseteq [-\infty, \infty)$ is an interval containing $\log \sigma_1(P) + \log \sigma_1(T)$, $\log \sigma_l(PT)$.
Let $h : I \to \mathbb{R}$ be a convex function. Then

(5.4.8) $\sum_{i=1}^l h(\log \sigma_i(PT)) \le \sum_{i=1}^l h(\log \sigma_i(P) + \log \sigma_i(T))$.

Corollary 5.4.8 Let $U, V, W$ be IPS. Assume that $T \in L(V, U)$, $P \in
L(U, W)$ and $l \in \mathbb{N}$. Then for any $t > 0$

(5.4.9) $\sum_{i=1}^l \sigma_i(PT)^t \le \sum_{i=1}^l \sigma_i(P)^t \sigma_i(T)^t$.

Equality holds if and only if one has equality sign in (5.4.3) for $k = 1,\ldots,l$.

Proof. Observe that the function $h : [-\infty, \infty)^l_{\searrow} \to \mathbb{R}$ given by
$h((x_1,\ldots,x_l)) = \sum_{i=1}^l e^{t x_i}$ is strictly strongly Schur order preserving for
any $t > 0$. □

The following theorem improves the results of Theorem 4.10.12.

Theorem 5.4.9 Let $V$ be an $n$-dimensional IPS vector space over $\mathbb{C}$
and assume that $T \in L(V)$. Let $\lambda_1(T),\ldots,\lambda_n(T) \in \mathbb{C}$ be the eigenvalues
of $T$ counted with their multiplicities and arranged in order $|\lambda_1(T)| \ge \ldots \ge
|\lambda_n(T)|$. Let $\boldsymbol{\lambda}_a(T) := (|\lambda_1(T)|,\ldots,|\lambda_n(T)|)$ and $\boldsymbol{\lambda}_{a,k}(T) := (|\lambda_1(T)|,\ldots,|\lambda_k(T)|)$
for $k = 1,\ldots,n$. Then

(5.4.10) $\prod_{i=1}^l |\lambda_i(T)| \le \prod_{i=1}^l \sigma_i(T)$ for $l = 1,\ldots,n-1$, and $\prod_{i=1}^n |\lambda_i(T)| = \prod_{i=1}^n \sigma_i(T)$.

For $l = 1,\ldots,k \le n$ equalities hold in the above inequalities if and only if
the conditions 1 and 2 of Theorem 4.10.12 hold.

In particular $\log \boldsymbol{\lambda}_{a,k}(T) \preceq \log \boldsymbol{\sigma}_k(T)$ for $k = 1,\ldots,n-1$ and $\log \boldsymbol{\lambda}_a(T) \prec
\log \boldsymbol{\sigma}(T)$.
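The inequalities (5.4.10) are easy to observe for $2 \times 2$ matrices, where both eigenvalues and singular values have closed forms. The sketch below is an illustration only; `eig_abs_2x2` and `singular_values_2x2` are hypothetical helper names, and the two sample matrices were chosen so that one is not normal and the other is.

```python
import cmath
import math


def eig_abs_2x2(a):
    """Moduli of the eigenvalues of a 2 x 2 matrix, in decreasing order."""
    tr = a[0][0] + a[1][1]
    dt = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    root = cmath.sqrt(tr * tr - 4 * dt)
    return sorted((abs((tr + root) / 2), abs((tr - root) / 2)), reverse=True)


def singular_values_2x2(a):
    """Singular values via the Frobenius norm and |det| (s1 >= s2)."""
    fro2 = sum(abs(x) ** 2 for row in a for x in row)
    d = abs(a[0][0] * a[1][1] - a[0][1] * a[1][0])
    disc = math.sqrt(max(fro2 * fro2 - 4 * d * d, 0.0))
    return math.sqrt((fro2 + disc) / 2), math.sqrt(max((fro2 - disc) / 2, 0.0))


T = [[1, 4], [0, 2]]    # not normal: |lambda_1| is strictly below sigma_1
N = [[0, -3], [3, 0]]   # normal (skew-symmetric): |lambda_i| equals sigma_i

lam_T, sig_T = eig_abs_2x2(T), singular_values_2x2(T)
lam_N, sig_N = eig_abs_2x2(N), singular_values_2x2(N)
```

In both cases the full products agree, as in the last identity of (5.4.10), because each equals $|\det T|$.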

Proof. By Theorem 4.10.12 $|\lambda_1(\wedge^l T)| \le \sigma_1(\wedge^l T)$. Use Problem 7
and Proposition 5.4.2 to deduce the inequalities in (5.4.10). The equality
$\prod_{i=1}^n |\lambda_i(T)| = \prod_{i=1}^n \sigma_i(T)$ is equivalent to the identity $|\det T|^2 = \det TT^*$.

Suppose that for $l = 1,\ldots,k \le n$ equalities hold in (5.4.10). Then
$|\lambda_i(T)| = \sigma_i(T)$ for $i = 1,\ldots,k$. Hence equality holds in (4.10.14).
Theorem 4.10.12 implies that conditions 1, 2 hold. Vice versa, assume that
the conditions 1, 2 of Theorem 4.10.12 hold. Then from the proof of
Theorem 4.10.12 it follows that $|\lambda_i(T)| = \sigma_i(T)$ for $i = 1,\ldots,k$. Hence for
$l = 1,\ldots,k$ equalities hold in (5.4.10). □



Corollary 5.4.10 Let $V$ be an $n$ dimensional IPS. Assume that $T \in
L(V)$.

1. Assume that $k \in [1, n - 1] \cap \mathbb{N}$ and $D \subseteq [-\infty, \infty)^k_{\searrow}$ is a strong Schur
set containing $\log \boldsymbol{\sigma}_k(T)$. Let $h : D \to \mathbb{R}$ be a strong Schur order
preserving function. Then $h(\log \boldsymbol{\lambda}_{a,k}(T)) \le h(\log \boldsymbol{\sigma}_k(T))$. Suppose
furthermore that $h$ is strict strong Schur order preserving. Then
equality holds in the above inequality if and only if equality holds in
(5.4.10) for $l = 1,\ldots,k$.

2. Let $I \subseteq [-\infty, \infty)$ be an interval containing $\log \sigma_1(T)$, $\log \sigma_k(T)$, $\log |\lambda_k(T)|$.
Assume that $f : I \to \mathbb{R}$ is a convex nondecreasing function. Then
$\sum_{i=1}^k f(\log |\lambda_i(T)|) \le \sum_{i=1}^k f(\log \sigma_i(T))$. If $f$ is a strictly convex
increasing function on $I$ then equality holds if and only if equality
holds in (5.4.10) for $l = 1,\ldots,k$. In particular for any $t > 0$

(5.4.11) $\sum_{i=1}^k |\lambda_i(T)|^t \le \sum_{i=1}^k \sigma_i(T)^t$.

Equality holds if and only if equality holds in (5.4.10) for $l = 1,\ldots,k$.

3. Assume that $D \subseteq [-\infty, \infty)^n_{\searrow}$ is a Schur set containing $\log \boldsymbol{\sigma}_n(T)$. Let
$h : D \to \mathbb{R}$ be a Schur order preserving function. Then $h(\log \boldsymbol{\lambda}_a(T)) \le
h(\log \boldsymbol{\sigma}_n(T))$. Suppose furthermore that $h$ is strict Schur order preserving.
Then equality holds in the above inequality if and only if
$T$ is a normal operator.



Problems 

1. Let $V_i$ be an $n_i$-dimensional vector space with the inner product
$\langle\cdot,\cdot\rangle_i$ for $i = 1,\ldots,k$.

(a) Let $e_{1,i},\ldots,e_{n_i,i}$ be an orthonormal basis of $V_i$ with respect to
$\langle\cdot,\cdot\rangle_i$ for $i = 1,\ldots,k$. Let $\langle\cdot,\cdot\rangle$ be the inner product in $Y :=
\otimes_{i=1}^k V_i$ such that $\otimes_{i=1}^k e_{j_i,i}$, where $j_i = 1,\ldots,n_i$, $i = 1,\ldots,k$, is
an orthonormal basis of $Y$. Show that (5.4.1) holds.
(b) Prove that there exists a unique inner product on $Y$ satisfying
(5.4.1).

2. Prove Proposition 5.4.1. 

3. Let $V$ be an $n$-dimensional IPS with an orthonormal basis $e_1,\ldots,e_n$.
Let $Y := \otimes_{i=1}^k V_i$, $V_1 = \ldots = V_k = V$ be an IPS with the canonical
inner product $\langle\cdot,\cdot\rangle_Y$. Show

(a) Let $k \in \mathbb{N} \cap [1, n]$. Then the subspace $\wedge^k V$ of $Y$ has an
orthonormal basis

$\frac{1}{\sqrt{k!}} e_{i_1} \wedge \ldots \wedge e_{i_k}$, $1 \le i_1 < i_2 < \ldots < i_k \le n$.

(b) Let $k \in \mathbb{N}$. Then the subspace $\mathrm{Sym}^k V$ of $Y$ has an orthonormal
basis $\alpha(i_1,\ldots,i_k)\,\mathrm{sym}^k(e_{i_1},\ldots,e_{i_k})$, $1 \le i_1 \le \ldots \le i_k \le n$. The
coefficient $\alpha(i_1,\ldots,i_k)$ is given as follows. Assume that $i_1 =
\ldots = i_{l_1} < i_{l_1+1} = \ldots = i_{l_1+l_2} < \ldots < i_{l_1+\ldots+l_{r-1}+1} = \ldots =
i_{l_1+\ldots+l_r}$, where $l_1 + \ldots + l_r = k$. Then $\alpha(i_1,\ldots,i_k) = \frac{1}{\sqrt{k!\, l_1! \cdots l_r!}}$.

4. (a) Prove Proposition 5.4.2. 
(b) Prove Corollary 5.4.3. 

5. Let $V, U$ be IPS of dimensions $n$ and $m$ respectively. Let $T \in L(V, U)$
and assume that we chose orthonormal bases $[c_1,\ldots,c_n]$, $[d_1,\ldots,d_m]$
of $V, U$ respectively satisfying (4.9.5). Suppose furthermore that

(5.4.12) $\sigma_1(T) = \ldots = \sigma_{l_1}(T) > \sigma_{l_1+1}(T) = \ldots = \sigma_{l_2}(T) > \ldots >
\sigma_{l_{p-1}+1}(T) = \ldots = \sigma_{l_p}(T) > 0$, $1 \le l_1 < \ldots < l_p = \mathrm{rank}\, T$.

Let

(5.4.13) $V_i := \mathrm{span}(c_{l_{i-1}+1},\ldots,c_{l_i})$, $i = 1,\ldots,p$, $l_0 := 0$.

(a) Let $k = l_i$ for some $i \in [1, p]$. Show that $\sigma_1(\wedge^k T) > \sigma_2(\wedge^k T)$.
Furthermore the vector $c_1 \wedge \ldots \wedge c_k$ is a unique right singular
vector (up to a multiplication by scalar) of $\wedge^k T$ corresponding
to $\sigma_1(\wedge^k T)$. Equivalently, the one dimensional subspace spanned
by the right singular vectors of $\wedge^k T$ corresponding to $\sigma_1(\wedge^k T)$ is given by $\wedge^k \oplus_{j=1}^i V_j$.
(b) Assume that $l_i - l_{i-1} \ge 2$ and $l_{i-1} < k < l_i$ for some $1 \le i \le p$.
Show that

(5.4.14) $\sigma_1(\wedge^k T) = \ldots = \sigma_{\binom{l_i - l_{i-1}}{k - l_{i-1}}}(\wedge^k T) > \sigma_{\binom{l_i - l_{i-1}}{k - l_{i-1}}+1}(\wedge^k T)$.

The subspace spanned by all right singular vectors of $\wedge^k T$ corresponding
to $\sigma_1(\wedge^k T)$ is given by the subspace:

$(\wedge^{l_{i-1}} \oplus_{j=1}^{i-1} V_j) \wedge (\wedge^{k-l_{i-1}} V_i)$.

6. Let $I := [-\infty, a)$, $a \in \mathbb{R}$ and assume that $f : I \to \mathbb{R}$ is continuous. $f$
is called convex on $I$ if $f(tb + (1 - t)c) \le tf(b) + (1 - t)f(c)$ for any
$b, c \in I$ and $t \in (0, 1)$. We assume that $t(-\infty) = -\infty$ for any $t > 0$.
Show that $f$ is convex on $I$ if and only if $f$ is a convex nondecreasing
bounded below function on $I_o$.

7. Let $D \subseteq [-\infty, \infty)^n_{\searrow}$ be such that $D' := D \cap \mathbb{R}^n$ is nonempty. Assume that
$f : D \to \mathbb{R}$ is continuous. Show

(a) $D$ is a Schur set if and only if $D'$ is a Schur set.

(b) $f$ is Schur order preserving if and only if $f|D'$ is Schur order
preserving.

(c) $f$ is strict Schur order preserving if and only if $f|D'$ is strict
Schur order preserving.

(d) $f$ is strong Schur order preserving if and only if $f|D'$ is strong
Schur order preserving.

(e) $f$ is strict strong Schur order preserving if and only if $f|D'$ is
strict strong Schur order preserving.

8. Let the assumptions of Theorem 5.4.4 hold. Assume that $\mathrm{rank}\, P =
\mathrm{rank}\, T = \mathrm{rank}\, PT$. Let $k = \mathrm{rank}\, P$. Show that the arguments of the
proof of Theorem 5.4.5 imply that $\prod_{i=1}^k \sigma_i(PT) = \prod_{i=1}^k \sigma_i(P)\sigma_i(T)$.
Hence $\log \boldsymbol{\sigma}_k(PT) \prec \log \boldsymbol{\sigma}_k(P) + \log \boldsymbol{\sigma}_k(T)$.

9. Prove Theorem 5.4.6 using the results of Section 4.7. 

10. Show that under the assumptions of Theorem 4.10.8 one has the
inequality $\sum_{i=1}^l \sigma_i(S^*T)^t \le \sum_{i=1}^l \sigma_i(S)^t \sigma_i(T)^t$ for any $l \in \mathbb{N}$ and $t > 0$.

11. (a) Let the assumptions of Theorem 5.4.9 hold. Show that (5.4.10)
implies that $\lambda_i(T) = 0$ for $i > \mathrm{rank}\, T$.
(b) Let $V$ be a finite dimensional vector space over the algebraically
closed field $\mathbb{F}$. Let $T \in L(V)$. Show that the number of nonzero
eigenvalues, counted with their multiplicities, does not exceed
$\mathrm{rank}\, T$. (Hint: Use the Jordan canonical form of $T$.)



5.5 Tensor products of exponents 

Proposition 5.5.1 (The Lie-Trotter formula.) Let $a > 0$, $A(t) : (-a, a) \to
\mathbb{C}^{n\times n}$, assume that

(5.5.1) $\lim_{t\to 0} \frac{A(t)}{t} = B$.

Then for any $s \in \mathbb{C}$

(5.5.2) $\lim_{t\to 0} (I + A(t))^{\frac{s}{t}} = e^{sB}$.

Proof. The assumption (5.5.1) yields that $B(t) := \frac{1}{t} A(t)$, $t \ne 0$, $B(0) :=
B$ is continuous at $t = 0$. Hence, there exists $\delta > 0$ such that for $|t| \le \delta$,
$\|B(t)\|_2 = \sigma_1(B(t)) \le c$. Without loss of generality we can assume that
$c\delta \le \frac{1}{2}$. Hence, for $|t| \le \delta$ all the eigenvalues of $A(t)$ are in the disk $|z| \le \frac{1}{2}$.
Consider the analytic function $\log(1 + z)$ in the unit disk $|z| < 1$ with
$\log 1 = 0$. The results of §3.1 yield that for $|t| \le \delta$



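$\log(I + A(t)) = \sum_{i=1}^{\infty} \frac{(-1)^{i-1}}{i} A(t)^i = \sum_{i=1}^{\infty} \frac{(-1)^{i-1} t^i}{i} B(t)^i$.

Recall that

$\left\| \sum_{i=2}^{\infty} \frac{(-1)^{i-1} t^{i-1}}{i} B(t)^i \right\|_2 \le \sum_{i=2}^{\infty} \frac{|t|^{i-1} \|B(t)\|_2^i}{i} \le \frac{-|t|c - \log(1 - |t|c)}{|t|}$.

Hence

$\lim_{t\to 0} (I + A(t))^{\frac{s}{t}} = \exp\left(\lim_{t\to 0} \frac{s}{t} \log(I + A(t))\right) = e^{sB}$.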
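The limit (5.5.2) can be observed numerically. The sketch below takes $s = 1$, $t = 1/m$ and the illustrative choice $A(t) = tB + t^2C$, so that $A(t)/t \to B$; the matrices `B`, `C` and the truncated Taylor series used for the exponential are assumptions made for the example, not part of the text.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matpow(a, m):
    """a**m for an integer m >= 0 by repeated squaring."""
    result, base = identity(len(a)), a
    while m:
        if m & 1:
            result = matmul(result, base)
        base = matmul(base, base)
        m >>= 1
    return result


def expm(a, terms=40):
    """Matrix exponential by a truncated Taylor series (adequate for small norms)."""
    n = len(a)
    total, term = identity(n), identity(n)
    for i in range(1, terms):
        term = [[x / i for x in row] for row in matmul(term, a)]
        total = [[total[r][c] + term[r][c] for c in range(n)] for r in range(n)]
    return total


B = [[0.0, 1.0], [-1.0, 0.5]]
C = [[2.0, 0.0], [1.0, -1.0]]  # arbitrary second-order perturbation


def trotter_error(m):
    """Largest entry of |(I + A(1/m))^m - e^B| for A(t) = t*B + t^2*C."""
    t = 1.0 / m
    step = [[(1.0 if i == j else 0.0) + t * B[i][j] + t * t * C[i][j]
             for j in range(2)] for i in range(2)]
    approx, exact = matpow(step, m), expm(B)
    return max(abs(approx[i][j] - exact[i][j]) for i in range(2) for j in range(2))
```

Evaluating `trotter_error(m)` for growing `m` gives values that shrink roughly like $1/m$, in line with the logarithm estimate used in the proof.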



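Proposition 5.5.2 Let $V_i$ be an $n_i$-dimensional vector space for $i =
1,\ldots,k$ over $\mathbb{F} = \mathbb{R}, \mathbb{C}$. Let $Y := \otimes_{i=1}^k V_i$. Assume that $A_i \in L(V_i)$, $i =
1,\ldots,k$. Then

(5.5.3) $\otimes_{i=1}^k e^{A_i t} = e^{(A_1,\ldots,A_k)_{\otimes} t} \in L(\otimes_{i=1}^k V_i)$, where

$(A_1,\ldots,A_k)_{\otimes} := \sum_{i=1}^k I_{n_1} \otimes \ldots \otimes I_{n_{i-1}} \otimes A_i \otimes I_{n_{i+1}} \otimes \ldots \otimes I_{n_k} \in L(\otimes_{i=1}^k V_i)$,

for any $t \in \mathbb{F}$.
See Problem 1.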

Definition 5.5.3 Let $V$ be an $n$-dimensional vector space over $\mathbb{F}$. Assume
that $A \in L(V)$. Denote by $A_{\wedge^k}$ the restriction of $(A,\ldots,A)_{\otimes}$ (with $k$ copies of $A$) to
$\wedge^k V$.

Corollary 5.5.4 Let the assumptions of Definition 5.5.3 hold for $\mathbb{F} =
\mathbb{R}, \mathbb{C}$. Then $\wedge^k e^{At} = e^{A_{\wedge^k} t}$ for any $t \in \mathbb{F}$.

Definition 5.5.5 A subspace $\mathbf{U} \subset \mathrm{H}_n$ is called a commuting subspace
if any two matrices $A, B \in \mathbf{U}$ commute.

Recall that if $A, B \in \mathrm{H}_n$ then each eigenvalue of $e^A e^B$ is positive. (See
Problem 4.)

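Theorem 5.5.6 Let $\mathbf{U}, \mathbf{V} \subset \mathrm{H}_n$ be two commuting subspaces. Then the
functions

(5.5.4) $f_k : \mathbf{U} \times \mathbf{V} \to \mathbb{R}$, $f_k(A, B) := \sum_{i=1}^k \log \lambda_i(e^A e^B)$, $k = 1,\ldots,n$,

are convex functions on $\mathbf{U} \times \mathbf{V}$. (The eigenvalues of $e^A e^B$ are arranged in
a decreasing order.)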

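Proof. Since $e^A e^B$ has positive eigenvalues for all pairs $A, B \in \mathrm{H}_n$
it follows that each $f_k$ is a continuous function on $\mathbf{U} \times \mathbf{V}$. Hence it is
enough to show that

(5.5.5) $f_k(\frac{1}{2}(A_1 + A_2), \frac{1}{2}(B_1 + B_2)) \le \frac{1}{2}(f_k(A_1, B_1) + f_k(A_2, B_2))$, $k = 1,\ldots,n$,

for any $A_1, A_2 \in \mathbf{U}$, $B_1, B_2 \in \mathbf{V}$. (See Problem 5.)
We first consider the case $k = 1$. Since $A_1A_2 = A_2A_1$, $B_1B_2 = B_2B_1$ it
follows that

$e^{\frac{1}{2}(A_1+A_2)} e^{\frac{1}{2}(B_1+B_2)} = e^{\frac{1}{2}A_2} e^{\frac{1}{2}A_1} e^{\frac{1}{2}B_1} e^{\frac{1}{2}B_2}$.

Observe next that

$e^{-\frac{1}{2}A_2} (e^{\frac{1}{2}A_2} e^{\frac{1}{2}A_1} e^{\frac{1}{2}B_1} e^{\frac{1}{2}B_2}) e^{\frac{1}{2}A_2} = e^{\frac{1}{2}A_1} e^{\frac{1}{2}B_1} e^{\frac{1}{2}B_2} e^{\frac{1}{2}A_2} \Rightarrow$
$f_1(\frac{1}{2}(A_1 + A_2), \frac{1}{2}(B_1 + B_2)) = \log \lambda_1(e^{\frac{1}{2}A_1} e^{\frac{1}{2}B_1} e^{\frac{1}{2}B_2} e^{\frac{1}{2}A_2})$.

Hence

(5.5.6) $\lambda_1(e^{\frac{1}{2}A_1} e^{\frac{1}{2}B_1} e^{\frac{1}{2}B_2} e^{\frac{1}{2}A_2}) \le \sigma_1(e^{\frac{1}{2}A_1} e^{\frac{1}{2}B_1} e^{\frac{1}{2}B_2} e^{\frac{1}{2}A_2}) \le$

$\sigma_1(e^{\frac{1}{2}A_1} e^{\frac{1}{2}B_1}) \sigma_1(e^{\frac{1}{2}B_2} e^{\frac{1}{2}A_2}) = \lambda_1(e^{\frac{1}{2}A_1} e^{\frac{1}{2}B_1} e^{\frac{1}{2}B_1} e^{\frac{1}{2}A_1})^{\frac{1}{2}} \lambda_1(e^{\frac{1}{2}A_2} e^{\frac{1}{2}B_2} e^{\frac{1}{2}B_2} e^{\frac{1}{2}A_2})^{\frac{1}{2}} =$
$\lambda_1(e^{\frac{1}{2}A_1} e^{\frac{1}{2}A_1} e^{\frac{1}{2}B_1} e^{\frac{1}{2}B_1})^{\frac{1}{2}} \lambda_1(e^{\frac{1}{2}A_2} e^{\frac{1}{2}A_2} e^{\frac{1}{2}B_2} e^{\frac{1}{2}B_2})^{\frac{1}{2}} = \lambda_1(e^{A_1} e^{B_1})^{\frac{1}{2}} \lambda_1(e^{A_2} e^{B_2})^{\frac{1}{2}}$.

Taking logarithms, this proves the convexity of $f_1$.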

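We now show the convexity of $f_k$. Use Problem 6 to deduce that we may
assume that $\mathbf{U} = U \mathcal{D}_n(\mathbb{R}) U^*$, $\mathbf{V} = V \mathcal{D}_n(\mathbb{R}) V^*$ for some $U, V \in \mathrm{U}(n)$. Let
$\mathbf{U}_k, \mathbf{V}_k \subset \mathrm{H}_{\binom{n}{k}}$ be two commuting subspaces defined in Problem 6(d). The
above result implies that $g : \mathbf{U}_k \times \mathbf{V}_k \to \mathbb{R}$ given by $g(C, D) = \log \lambda_1(e^C e^D)$ is
convex. Hence

$g(\frac{1}{2}((A_1)_{\wedge^k} + (A_2)_{\wedge^k}), \frac{1}{2}((B_1)_{\wedge^k} + (B_2)_{\wedge^k})) \le \frac{1}{2}(g((A_1)_{\wedge^k}, (B_1)_{\wedge^k}) + g((A_2)_{\wedge^k}, (B_2)_{\wedge^k}))$.

The definitions of $(A,\ldots,A)_{\otimes}$ and $A_{\wedge^k}$ yield the equality $\frac{1}{2}(A_{\wedge^k} + B_{\wedge^k}) =
(\frac{1}{2}(A + B))_{\wedge^k}$. Use Problem 3 to deduce that the convexity of $g$ implies the
convexity of $f_k$. □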

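Theorem 5.5.7 Let $A, B \in \mathrm{H}_n$, $k \in [1,n] \cap \mathbb{N}$. Assume that $f_k(tA, tB)$, $t \in \mathbb{R}$, is defined as in (5.5.4). Then the function $\frac{f_k(tA,tB)}{t}$ is nondecreasing on $(0, \infty)$. In particular

(5.5.7) $\sum_{i=1}^{k} \lambda_i(A+B) \le \sum_{i=1}^{k} \log \lambda_i(e^A e^B), \quad k = 1,\ldots,n,$

(5.5.8) $\mathrm{tr}\, e^{A+B} \le \mathrm{tr}(e^A e^B).$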

Equality in (5.5.8) holds if and only if $AB = BA$.

Proof. Theorem 5.5.6 yields that $g_k(t) := f_k(tA, tB)$ is convex on $\mathbb{R}$. (Assume $\mathbf{U} = \mathrm{span}(A)$, $\mathbf{V} = \mathrm{span}(B)$.) Note that $g_k(0) = 0$. Problem 8 implies that $\frac{g_k(t)}{t}$ is nondecreasing on $(0, \infty)$. Problem 9 implies that

$\lim_{t \searrow 0} \frac{g_k(t)}{t} = \sum_{i=1}^{k} \lambda_i(A+B).$

Hence (5.5.7) holds. Use Problem 7 to deduce that in (5.5.7) equality holds for $k = n$. Hence (5.5.7) is equivalent to

(5.5.9) $\lambda(A+B) \prec \log \lambda(e^A e^B).$

Apply the convex function $e^x$ to this relation to deduce (5.5.8).
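
The inequality (5.5.8) can be checked numerically on small examples. The following is a minimal sketch in pure Python, assuming real symmetric $2 \times 2$ matrices given as lists of rows; the helper names (`mat_exp`, `golden_thompson_gap`) and the scaling-and-squaring Taylor approximation of the exponential are illustrative choices, not part of the text.

```python
import math

def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(X, Y):
    """Entrywise sum of two square matrices."""
    n = len(X)
    return [[X[i][j] + Y[i][j] for j in range(n)] for i in range(n)]

def mat_exp(X, terms=40, squarings=10):
    """Approximate e^X: Taylor series of X / 2^s, then square s times."""
    n = len(X)
    scale = 2.0 ** squarings
    Y = [[X[i][j] / scale for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        # term becomes Y^m / m!
        term = [[t / m for t in row] for row in mat_mul(term, Y)]
        result = mat_add(result, term)
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

def golden_thompson_gap(A, B):
    """tr(e^A e^B) - tr(e^(A+B)); (5.5.8) predicts a nonnegative value."""
    return trace(mat_mul(mat_exp(A), mat_exp(B))) - trace(mat_exp(mat_add(A, B)))

# A and B do not commute; A and C are both diagonal, hence commute.
A = [[1.0, 0.0], [0.0, -1.0]]
B = [[0.0, 1.0], [1.0, 0.0]]
C = [[2.0, 0.0], [0.0, 3.0]]

gap_noncommuting = golden_thompson_gap(A, B)
gap_commuting = golden_thompson_gap(A, C)

# Closed form for this particular pair: 2 cosh(1)^2 - 2 cosh(sqrt(2)).
expected_gap = 2 * math.cosh(1) ** 2 - 2 * math.cosh(math.sqrt(2))
```

For the noncommuting pair the gap is strictly positive, while for the commuting pair it vanishes up to rounding, in line with the equality case of the theorem.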

We now show that equality holds in (5.5.8) if and only if $AB = BA$. Clearly if $AB = BA$ then $e^{At} e^{Bt} = e^{(A+B)t}$; hence we have equality in (5.5.8).

It is left to show the claim that equality in (5.5.8) implies that $A$ and $B$ commute. We prove this claim by induction on $n$. For $n = 1$ this claim trivially holds. Assume that this claim holds for $n = m - 1$. Let $n = m$. Since $e^x$ is strictly convex it follows that equality in (5.5.8) yields the equality $\lambda(A+B) = \log \lambda(e^A e^B)$, in particular $\lambda_1(A+B) = \log \lambda_1(e^A e^B)$. Hence $\frac{g_1(t)}{t}$ is a constant function on $(0,1]$, i.e. $g_1(t) = Kt$ for $t \in [0,1]$. Use the inequalities (5.5.6) for $k = 1$, $A_1 = At$, $A_2 = A$, $B_1 = Bt$, $B_2 = B$, $t \in (0,1)$ to conclude that we must have equalities in all inequalities in (5.5.6). In particular we first must have the equality $\lambda_1(e^A e^B) = \sigma_1(e^A e^B)$. Similarly we conclude that

$\prod_{i=1}^{k} \lambda_i(e^A e^B) = \prod_{i=1}^{k} \sigma_i(e^A e^B), \; k = 1,\ldots,n \;\Rightarrow\; \lambda_i(e^A e^B) = \sigma_i(e^A e^B), \; i = 1,\ldots,n.$

Theorem 4.10.12 yields that $e^A e^B$ is a normal matrix.

Assume first that all the eigenvalues of $e^A e^B$ are equal. Hence $e^A e^B = cI \Rightarrow e^A e^B = e^B e^A \Rightarrow AB = BA$. Assume now that $e^A e^B$ has $k$ distinct eigenvalues $\gamma_1 > \ldots > \gamma_k > 0$. Let $\mathbf{W}_i$ be the eigenspace of $e^A e^B$ corresponding to $\gamma_i$. Clearly $e^B \mathbf{W}_i$ is the eigenspace of $e^B e^A$ corresponding to $\gamma_i$. Hence $e^B \mathbf{W}_i = \mathbf{W}_i \Rightarrow B\mathbf{W}_i \subset \mathbf{W}_i$. Similarly $e^A \mathbf{W}_i = \mathbf{W}_i \Rightarrow A\mathbf{W}_i \subset \mathbf{W}_i$. Since $e^A e^B|_{\mathbf{W}_i} = \gamma_i I|_{\mathbf{W}_i}$ it follows that $AB|_{\mathbf{W}_i} = BA|_{\mathbf{W}_i}$ for $i = 1,\ldots,k$. Hence $e^A e^B = e^B e^A \Rightarrow AB = BA$. □

Let

(5.5.10) $C(t) := \frac{1}{t} \log\big(e^{\frac12 At} e^{Bt} e^{\frac12 At}\big) \in \mathrm{H}_n, \quad t \in \mathbb{R} \setminus \{0\}.$

$tC(t)$ is the unique hermitian logarithm of the positive definite hermitian matrix $e^{\frac12 At} e^{Bt} e^{\frac12 At}$, which is similar to $e^{At} e^{Bt}$. Proposition 5.5.1 yields

(5.5.11) $\lim_{t \to 0} C(t) = C(0) := A + B.$

(See Problem 11.) In what follows we give a complementary formula to (5.5.11).

Theorem 5.5.8 Let $A, B \in \mathrm{H}_n$ and assume that $C(t)$ is the hermitian matrix defined as in (5.5.10). Then $\sum_{i=1}^{k} \lambda_i(C(t))$ are nondecreasing functions on $[0, \infty)$ for $k = 1,\ldots,n$ satisfying

(5.5.12) $\lambda(C(t)) \prec \lambda(A) + \lambda(B).$

Moreover there exists $C \in \mathrm{H}_n$ such that

(5.5.13) $\lim_{t \to \infty} C(t) = C,$

and $C$ commutes with $A$. Furthermore there exist two permutations $\phi, \psi$ on $\{1,\ldots,n\}$ such that

(5.5.14) $\lambda_i(C) = \lambda_{\phi(i)}(A) + \lambda_{\psi(i)}(B), \quad i = 1,\ldots,n.$

Proof. Assume that $t > 0$ and let $\lambda_i(t) = e^{t\lambda_i(C(t))}$, $i = 1,\ldots,n$, be the eigenvalues of $G(t) := e^{\frac12 At} e^{Bt} e^{\frac12 At}$. Clearly

$\lambda_1(t) = \|e^{\frac12 At} e^{Bt} e^{\frac12 At}\|_2 \le \|e^{\frac12 At}\|_2^2\, \|e^{Bt}\|_2 = e^{(\lambda_1(A) + \lambda_1(B))t}.$

By considering $\wedge^k G(t)$ we deduce

$\prod_{i=1}^{k} \lambda_i(t) \le e^{t \sum_{i=1}^{k} (\lambda_i(A) + \lambda_i(B))}, \quad k = 1,\ldots,n, \quad t > 0.$

Note that for $k = n$ equality holds. (See Problem 7.) Hence (5.5.12) holds. Let $g_k(t)$ be defined as in the proof of Theorem 5.5.7. Clearly $\frac{g_k(t)}{t} = \sum_{i=1}^{k} \lambda_i(C(t))$. Since $\frac{g_k(t)}{t}$ is nondecreasing we deduce that $\sum_{i=1}^{k} \lambda_i(C(t))$ is nondecreasing on $[0, \infty)$. Furthermore (5.5.12) shows that $\frac{g_k(t)}{t}$ is bounded. Hence $\lim_{t \to \infty} \frac{g_k(t)}{t}$ exists for each $k = 1,\ldots,n$, which is equivalent to

(5.5.15) $\lim_{t \to \infty} \lambda_i(C(t)) = \omega_i, \quad i = 1,\ldots,n.$

Let

(5.5.16) $\omega_1 = \ldots = \omega_{n_1} > \omega_{n_1+1} = \ldots = \omega_{n_2} > \ldots > \omega_{n_{r-1}+1} = \ldots = \omega_{n_r}, \quad n_0 = 0 < n_1 < \ldots < n_r = n.$

Let $\omega_0 := \omega_1 + 1$, $\omega_{n+1} := \omega_n - 1$. Hence for $t > T$ the open interval $\big(\frac{\omega_{n_i} + \omega_{n_i+1}}{2}, \frac{\omega_{n_{i-1}} + \omega_{n_{i-1}+1}}{2}\big)$ contains exactly $n_i - n_{i-1}$ eigenvalues of $C(t)$ for $i = 1,\ldots,r$. In what follows we assume that $t > T$. Let $P_i(t) \in \mathrm{H}_n$ be the orthogonal projection on the eigenspace of $C(t)$ corresponding to the eigenvalues $\lambda_{n_{i-1}+1}(C(t)), \ldots, \lambda_{n_i}(C(t))$ for $i = 1,\ldots,r$. Observe that $P_i(t)$ is the orthogonal projection on the eigenspace of $G(t)$ corresponding to the eigenvalues $\lambda_{n_{i-1}+1}(t), \ldots, \lambda_{n_i}(t)$ for $i = 1,\ldots,r$. The equality (5.5.13) is equivalent to

(5.5.17) $\lim_{t \to \infty} P_i(t) = P_i, \quad i = 1,\ldots,r.$

The claim that $CA = AC$ is equivalent to the claim that $AP_i = P_iA$ for $i = 1,\ldots,r$.

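We first show these claims for $i = 1$. Assume that the eigenvalues of $A$ and $B$ are of the form

$\lambda_1(A) = \ldots = \lambda_{l_1}(A) = \alpha_1 > \lambda_{l_1+1}(A) = \ldots = \lambda_{l_2}(A) = \alpha_2 > \ldots > \lambda_{l_{p-1}+1}(A) = \ldots = \lambda_{l_p}(A) = \alpha_p,$
$\lambda_1(B) = \ldots = \lambda_{m_1}(B) = \beta_1 > \lambda_{m_1+1}(B) = \ldots = \lambda_{m_2}(B) = \beta_2 > \ldots > \lambda_{m_{q-1}+1}(B) = \ldots = \lambda_{m_q}(B) = \beta_q,$
(5.5.18) $l_0 = 0 < l_1 < \ldots < l_p = n, \quad m_0 = 0 < m_1 < \ldots < m_q = n.$

Note that if either $p = 1$ or $q = 1$, i.e. either $A$ or $B$ is of the form $aI$, then the theorem trivially holds. Assume that $p, q > 1$. Let $Q_i, R_j$ be the orthogonal projections on the eigenspaces of $A$ and $B$ corresponding to the eigenvalues $\alpha_i$ and $\beta_j$ respectively, for $i = 1,\ldots,p$, $j = 1,\ldots,q$. So

$e^{\frac12 At} = \sum_{i=1}^{p} e^{\frac12 \alpha_i t} Q_i, \quad e^{Bt} = \sum_{j=1}^{q} e^{\beta_j t} R_j, \quad G(t) = \sum_{i_1, i_2 = 1}^{p} \sum_{j=1}^{q} e^{\frac12(\alpha_{i_1} + \alpha_{i_2} + 2\beta_j)t}\, Q_{i_1} R_j Q_{i_2}.$

Observe next that

$\mathrm{rank}\, Q_iR_j = \mathrm{rank}\, (Q_iR_j)^* = \mathrm{rank}\, R_jQ_i = \mathrm{rank}\, (Q_iR_j)(Q_iR_j)^* = \mathrm{rank}\, Q_iR_j^2Q_i = \mathrm{rank}\, Q_iR_jQ_i,$
$Q_{i_1}R_jQ_{i_2} = (Q_{i_1}R_j)(R_jQ_{i_2}) \ne 0 \;\Rightarrow\; Q_{i_1}R_jQ_{i_1} \ne 0, \; Q_{i_2}R_jQ_{i_2} \ne 0,$
$\mathcal{K} := \{(i,j) \in \{1,\ldots,p\} \times \{1,\ldots,q\}, \; Q_iR_j \ne 0\},$
$I = \big(\sum_{i=1}^{p} Q_i\big)\big(\sum_{j=1}^{q} R_j\big) = \sum_{i,j=1}^{p,q} Q_iR_j = \sum_{(i,j) \in \mathcal{K}} Q_iR_j.$

(See Problem 14.) Let

(5.5.19) $\gamma_1 := \max_{(i,j) \in \mathcal{K}} \alpha_i + \beta_j, \quad \mathcal{K}_1 := \{(i,j) \in \mathcal{K}, \; \alpha_i + \beta_j = \gamma_1\},$
$n_1' := \sum_{(i,j) \in \mathcal{K}_1} \mathrm{rank}\, Q_iR_jQ_i, \quad \gamma_1' := \max_{(i,j) \in \mathcal{K} \setminus \mathcal{K}_1} \alpha_i + \beta_j.$

From the above equalities we deduce that $\mathcal{K}_1 \ne \emptyset$. Assume that $(i,j), (i',j') \in \mathcal{K}_1$ are distinct pairs. From the maximality of $\gamma_1$ and the definition of $\mathcal{K}_1$ it follows that $i \ne i'$, $j \ne j'$. Hence $Q_iR_jQ_i(Q_{i'}R_{j'}Q_{i'}) = 0$. Furthermore $\gamma_1'$ is well defined and $\gamma_1' < \gamma_1$. Let

(5.5.20) $D_1(t) := \sum_{(i,j) \in \mathcal{K} \setminus \mathcal{K}_1} e^{(\alpha_i + \beta_j - \gamma_1)t}\, Q_iR_jQ_i + \sum_{(i_1,j),(i_2,j) \in \mathcal{K},\, i_1 \ne i_2} e^{\frac12(\alpha_{i_1} + \alpha_{i_2} + 2\beta_j - 2\gamma_1)t}\, Q_{i_1}R_jQ_{i_2},$

$D := \sum_{(i,j) \in \mathcal{K}_1} Q_iR_jQ_i, \quad D(t) := D + D_1(t).$

Then $n_1' = \mathrm{rank}\, D$. (See Problem 15b.) We claim that

(5.5.21) $\omega_1 = \gamma_1, \quad n_1 = n_1'.$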

From the above equalities and definitions we deduce $G(t) = e^{\gamma_1 t} D(t)$. Hence $\lambda_i(t) = e^{\gamma_1 t} \lambda_i(D(t))$. As each term $e^{\frac12(\alpha_{i_1} + \alpha_{i_2} + 2\beta_j - 2\gamma_1)t}$ appearing in $D_1(t)$ is bounded above by $e^{-\frac12(\gamma_1 - \gamma_1')t}$ we deduce $D_1(t) = e^{\frac12(\gamma_1' - \gamma_1)t} D_2(t)$ and $0 \le \|D_2(t)\|_2 \le K$. Hence $\lim_{t \to \infty} D(t) = D$. Since $\mathrm{rank}\, D = n_1'$ we deduce that we have $0.5\lambda_i(D) < \lambda_i(D(t)) \le 2\lambda_i(D)$ for $i = 1,\ldots,n_1'$. If $n_1' < n$ then from Theorem 4.4.6 we obtain that

$\lambda_i(D(t)) = \lambda_i(D + D_1(t)) \le \lambda_i(D) + \lambda_1(D_1(t)) = \lambda_1(D_1(t)) \le e^{\frac12(\gamma_1' - \gamma_1)t} K,$

for $i = n_1' + 1, \ldots, n$. Hence

$\omega_i = \gamma_1, \; i = 1,\ldots,n_1', \quad \omega_i \le \tfrac12(\gamma_1 + \gamma_1'), \; i = n_1' + 1,\ldots,n,$

which shows (5.5.21). Furthermore $\lim_{t \to \infty} P_1(t) = P_1$, where $P_1$ is the projection on $D\mathbb{C}^n$. Since $Q_iQ_{i'} = \delta_{ii'}Q_i$ it follows that $Q_{i'}D = DQ_{i'}$ for $i' = 1,\ldots,p$. Hence $AD = DA \Rightarrow AP_1 = P_1A$. Furthermore $P_1\mathbb{C}^n$ is a direct sum of the orthogonal subspaces $Q_iR_jQ_i\mathbb{C}^n$, $(i,j) \in \mathcal{K}_1$, which are eigen-subspaces of $A$ corresponding to $\alpha_i$ for $(i,j) \in \mathcal{K}_1$.

We now define partially the permutations $\phi, \psi$. Assume that $\mathcal{K}_1 = \{(i_1,j_1), \ldots, (i_o,j_o)\}$. Then $\omega_1 = \gamma_1 = \alpha_{i_k} + \beta_{j_k}$ for $k = 1,\ldots,o$. Let $e_0 = 0$ and $e_k = e_{k-1} + \mathrm{rank}\, Q_{i_k}R_{j_k}Q_{i_k}$ for $k = 1,\ldots,o$. Note that $e_o = n_1'$. Define

(5.5.22) $\phi(s) = l_{i_k - 1} + s - e_{k-1}, \quad \psi(s) = m_{j_k - 1} + s - e_{k-1}, \quad$ for $s = e_{k-1}+1, \ldots, e_k$, $k = 1,\ldots,o$.

Then $\omega_1 = \omega_s = \lambda_{\phi(s)}(A) + \lambda_{\psi(s)}(B)$ for $s = 1,\ldots,n_1$.
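Next we consider the matrix

$G_2(t) = \wedge^{n_1+1} G(t) = \wedge^{n_1+1}\big(e^{\frac12 At} e^{Bt} e^{\frac12 At}\big) = e^{\frac12 A_{\wedge_{n_1+1}} t}\, e^{B_{\wedge_{n_1+1}} t}\, e^{\frac12 A_{\wedge_{n_1+1}} t}.$

So $\lambda_1(G_2(t)) = \prod_{i=1}^{n_1+1} \lambda_i(t)$, and more generally all the eigenvalues of $G_2(t)$ are of the form

$\prod_{i=1}^{n_1+1} \lambda_{j_i}(t), \quad 1 \le j_1 < j_2 < \ldots < j_{n_1+1} \le n.$

Since we already showed that $\lim_{t \to \infty} \frac{\log \lambda_i(t)}{t} = \omega_i$ for $i = 1,\ldots,n$ we deduce that

$\lim_{t \to \infty} \Big(\prod_{i=1}^{n_1+1} \lambda_{j_i}(t)\Big)^{\frac1t} = e^{\sum_{i=1}^{n_1+1} \omega_{j_i}}.$

Hence all the eigenvalues of $G_2(t)^{\frac1t}$ converge to the above values for all choices of $1 \le j_1 < j_2 < \ldots < j_{n_1+1} \le n$. The limit of the maximal eigenvalue of $G_2(t)^{\frac1t}$ is equal to $e^{\omega_1 + \ldots + \omega_{n_1} + \omega_i}$ for $i = n_1+1, \ldots, n_2$, which is of multiplicity $n_2 - n_1$. Let $P_{2,1}(t)$ be the projection on the eigenspace of $G_2(t)$ corresponding to the first $n_2 - n_1$ eigenvalues of $G_2(t)$. Our results for $G(t)$ yield that $\lim_{t \to \infty} P_{2,1}(t) = P_{2,1}$, where $P_{2,1}$ is the projection on a direct sum of eigen-subspaces of $A_{\wedge_{n_1+1}}$. Let $\mathbf{W}_2(t) = P_1(t)\mathbb{C}^n + P_2(t)\mathbb{C}^n$ be the subspace of dimension $n_2$ spanned by the eigenvectors of $G(t)$ corresponding to $\lambda_1(t), \ldots, \lambda_{n_2}(t)$. Then $P_{2,1}(t) \wedge^{n_1+1} \mathbb{C}^n$ is the subspace of the form $(\wedge^{n_1} P_1(t)\mathbb{C}^n) \wedge (P_2(t)\mathbb{C}^n)$. Since $\lim_{t \to \infty} P_1(t)\mathbb{C}^n = P_1\mathbb{C}^n$ and $\lim_{t \to \infty} P_{2,1}(t) \wedge^{n_1+1} \mathbb{C}^n = P_{2,1} \wedge^{n_1+1} \mathbb{C}^n$ we deduce that $\lim_{t \to \infty} P_2(t)\mathbb{C}^n = \mathbf{W}_2$ for some subspace $\mathbf{W}_2$ of dimension $n_2 - n_1$ which is orthogonal to $P_1\mathbb{C}^n$. Let $P_2$ be the orthogonal projection on $\mathbf{W}_2$. Hence $\lim_{t \to \infty} P_2(t) = P_2$. (See Problem 12 for details.)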

We now show that there exist two permutations $\phi, \psi$ on $\{1,\ldots,n\}$ satisfying (5.5.22) such that $\omega_i = \lambda_{\phi(i)}(A) + \lambda_{\psi(i)}(B)$ for $i = n_1+1, \ldots, n_2$. Furthermore $AP_2 = P_2A$. To do that we need to apply carefully our results for $\omega_1, \ldots, \omega_{n_1}$. The logarithm of the first $n_2 - n_1$ limit eigenvalues of $G_2(t)^{\frac1t}$ must be of the form $\lambda_a(A_{\wedge_{n_1+1}}) + \lambda_b(B_{\wedge_{n_1+1}})$. The values of the indices $a$ and $b$ can be identified as follows. Recall that the indices $\phi(i)$, $i = 1,\ldots,n_1$ in $\omega_i = \lambda_{\phi(i)}(A) + \lambda_{\psi(i)}(B)$ can be determined from the projection $P_1$, where $P_1$ is viewed as the sum of the projections on the orthogonal eigen-subspaces $Q_iR_j\mathbb{C}^n$, $(i,j) \in \mathcal{K}_1$. Recall that $P_{2,1} \wedge^{n_1+1} \mathbb{C}^n$ is of the form $\wedge^{n_1}(P_1\mathbb{C}^n) \wedge (P_2\mathbb{C}^n)$. Since $P_2\mathbb{C}^n$ is orthogonal to $P_1\mathbb{C}^n$ and $\wedge^{n_1}(P_1\mathbb{C}^n) \wedge (P_2\mathbb{C}^n)$ is an invariant subspace of $A_{\wedge_{n_1+1}}$ it follows that $P_2\mathbb{C}^n$ is an invariant subspace of $A$. It is spanned by eigenvectors of $A$ which are orthogonal to $P_1\mathbb{C}^n$, the latter being spanned by the eigenvectors corresponding to $\lambda_{\phi(i)}(A)$, $i = 1,\ldots,n_1$. Hence the eigenvalues of the eigenvectors spanning $P_2\mathbb{C}^n$ are of the form $\lambda_k(A)$ for $k \in \mathcal{I}_2$, where $\mathcal{I}_2 \subset \{1,\ldots,n\} \setminus \{\phi(1), \ldots, \phi(n_1)\}$ is a set of cardinality $n_2 - n_1$. Hence $P_2Q_i = Q_iP_2$, $i = 1,\ldots,p$, which implies that $P_2A = AP_2$.

Note that $\lambda_a(A_{\wedge_{n_1+1}}) = \sum_{j=1}^{n_1} \lambda_{\phi(j)}(A) + \lambda_k(A)$ for $k \in \mathcal{I}_2$. Since $G(t) = e^{\frac12 At} e^{Bt} e^{\frac12 At}$ is similar to the matrix $H_1(t) := e^{\frac12 Bt} e^{At} e^{\frac12 Bt}$ we can apply the same arguments to $H_2(t) := \wedge^{n_1+1} H_1(t)$. We conclude that there exists a set $\mathcal{J}_2 \subset \{1,\ldots,n\} \setminus \{\psi(1), \ldots, \psi(n_1)\}$ of cardinality $n_2 - n_1$ such that $\lambda_b(B_{\wedge_{n_1+1}}) = \sum_{j=1}^{n_1} \lambda_{\psi(j)}(B) + \lambda_{k'}(B)$ for $k' \in \mathcal{J}_2$. Hence the logarithm of the limit value of the largest eigenvalue of $G_2(t)^{\frac1t}$, which is equal to $\omega_1 + \ldots + \omega_{n_1} + \omega_{n_1+1}$, is given by $n_2 - n_1$ sums of the pairs $\lambda_a(A_{\wedge_{n_1+1}}) + \lambda_b(B_{\wedge_{n_1+1}})$. The pairing $(a,b)$ induces the pairing $(k,k')$ in $\mathcal{I}_2 \times \mathcal{J}_2$. Choose any permutation $\phi$ such that $\phi(1), \ldots, \phi(n_1)$ are defined as above and $\{\phi(n_1+1), \ldots, \phi(n_2)\} = \mathcal{I}_2$. We deduce the existence of a permutation $\psi$, where $\psi(1), \ldots, \psi(n_1)$ are defined as above, $\{\psi(n_1+1), \ldots, \psi(n_2)\} = \mathcal{J}_2$, and $(\phi(i), \psi(i))$ is the pairing $(k,k')$ for $i = n_1+1, \ldots, n_2$. This shows that $\omega_i = \lambda_{\phi(i)}(A) + \lambda_{\psi(i)}(B)$ for $i = n_1+1, \ldots, n_2$. By considering the matrices $\wedge^{n_i+1} G(t)$ for $i = 2,\ldots,r$ we deduce the theorem. □



Problems 

1. Prove Proposition 5.5.2. (Hint: Show that the left-hand side of (5.5.3) is a one-parameter group in $t$ with the generator $(A_1, \ldots, A_k)_\otimes$.)

2. Let the assumptions of Proposition 5.5.2 hold. Assume that $\lambda(A_i) = (\lambda_1(A_i), \ldots, \lambda_{n_i}(A_i))$ for $i = 1,\ldots,k$. Show that the $n_1 \cdots n_k$ eigenvalues of $(A_1, \ldots, A_k)_\otimes$ are of the form $\sum_{i=1}^{k} \lambda_{j_i}(A_i)$, where $j_i = 1,\ldots,n_i$, $i = 1,\ldots,k$.
(Hint: Recall that the eigenvalues of $\otimes_{i=1}^{k} e^{A_i t}$ are of the form $\prod_{i=1}^{k} e^{\lambda_{j_i}(A_i)t}$.)

3. Let the assumptions of Definition 5.5.3 hold for $\mathbb{F} = \mathbb{C}$. Assume that $\lambda(A) = (\lambda_1, \ldots, \lambda_n)$. Show that the eigenvalues of $A_{\wedge_k}$ are $\lambda_{i_1} + \ldots + \lambda_{i_k}$ for all $1 \le i_1 < \ldots < i_k \le n$.

4. Let $A, B \in \mathbb{C}^{n \times n}$.

(a) Show that $e^A e^B$ is similar to $e^{\frac12 A} e^B e^{\frac12 A}$.

(b) Show that if $A, B \in \mathrm{H}_n$ then all the eigenvalues of $e^A e^B$ are real and positive.

5. Let $g \in C[a,b]$. Show that the following are equivalent:

(a) $g(\tfrac12(x_1 + x_2)) \le \tfrac12(g(x_1) + g(x_2))$ for all $x_1, x_2 \in [a,b]$.

(b) $g(t_1x_1 + t_2x_2) \le t_1 g(x_1) + t_2 g(x_2)$ for all $t_1, t_2 \in [0,1]$, $t_1 + t_2 = 1$ and $x_1, x_2 \in [a,b]$.

Hint: Fix $x_1, x_2 \in [a,b]$. First show that (a)$\Rightarrow$(b) for any $t_1, t_2 \in [0,1]$ which have finite binary expansions. Use the continuity to deduce that (a)$\Rightarrow$(b).

6. (a) Let $\mathrm{D}_n(\mathbb{R}) \subset \mathrm{H}_n$ be the subspace of diagonal matrices. Show that $\mathrm{D}_n(\mathbb{R})$ is a maximal commuting subspace.

(b) Let $\mathbf{U} \subset \mathrm{H}_n$ be a commuting subspace. Show that there exists a unitary matrix $U \in \mathrm{U}(n)$ such that $\mathbf{U}$ is a subspace of the maximal commuting subspace $U \mathrm{D}_n(\mathbb{R}) U^*$.

(c) Show that a commuting subspace $\mathbf{U} \subset \mathrm{H}_n$ is maximal if and only if $\mathbf{U}$ contains $A$ with $n$ distinct eigenvalues.

(d) Let $\mathbf{U} \subset \mathrm{H}_n$ be a commuting subspace. Show that for each $k \in [1,n] \cap \mathbb{N}$ the subspace $\mathbf{U}_k := \mathrm{span}(A_{\wedge_k} : A \in \mathbf{U})$ is a commuting subspace of $\mathrm{H}_{\binom{n}{k}}$.

7. Let $A, B \in \mathrm{H}_n$ and assume that $f_n(A,B)$ is defined as in (5.5.4). Show that $f_n(A,B) = \mathrm{tr}(A+B)$. Hence $f_n(A,B)$ is convex on $\mathrm{H}_n \times \mathrm{H}_n$.

Remark. We suspect that $f_k : \mathrm{H}_n \times \mathrm{H}_n \to \mathbb{R}$ defined as in (5.5.4) is not a convex function for $k = 1,\ldots,n-1$.

8. Let $g : [0, \infty) \to \mathbb{R}$ be a continuous convex function. Show that if $g(0) = 0$ then the function $\frac{g(t)}{t}$ is nondecreasing on $(0, \infty)$. (Hint: Observe that $g(x) \le \frac{x}{y} g(y) + (1 - \frac{x}{y}) g(0)$ for any $0 < x < y$.)

9. Let $A, B \in \mathbb{C}^{n \times n}$.

(a) Show $\lim_{t \to 0} \frac1t (e^{At} e^{Bt} - I) = A + B$.

(b) View $(e^{At} e^{Bt})^{\frac1t}$ as $(I + (e^{At} e^{Bt} - I))^{\frac1t}$. Show $\lim_{t \to 0} (e^{At} e^{Bt})^{\frac1t} = e^{A+B}$.

(c) Show that if $A, B \in \mathrm{H}_n$ then $\lim_{t \searrow 0} \frac1t \log \lambda_i(e^{At} e^{Bt}) = \lambda_i(A+B)$ for $i = 1,\ldots,n$.

10. Let $A, B \in \mathrm{H}_n$.

(a) Assume that there exists a vector $x_1$ of length 1 such that $Ax_1 = \lambda_1(A)x_1$, $Bx_1 = \lambda_1(B)x_1$. Then

$(A+B)x_1 = (\lambda_1(A) + \lambda_1(B))x_1, \quad e^A e^B x_1 = e^{\lambda_1(A) + \lambda_1(B)} x_1.$

The convexity of $\lambda_1(\cdot)$ on $\mathrm{H}_n$ implies that $\lambda_1(A+B) = \lambda_1(A) + \lambda_1(B)$. The inequalities

$e^{\lambda_1(A) + \lambda_1(B)} \le \lambda_1(e^A e^B) \le \sigma_1(e^A e^B) \le \sigma_1(e^A)\sigma_1(e^B) = e^{\lambda_1(A) + \lambda_1(B)}$

imply that we have equalities in the above inequalities. This shows that equality holds for $k = 1$ in (5.5.7) if $A$ and $B$ have a common eigenvector corresponding to the first eigenvalue of $A$ and $B$.

11. Let $C(t)$ be defined by (5.5.10).

(a) Show that $C(-t) = C(t)$ for any $t \ne 0$.

(b) Show the equality (5.5.11).

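12. Let $\mathbf{V}$ be an $n$-dimensional inner product space over $\mathbb{F} = \mathbb{R}, \mathbb{C}$, with the inner product $(\cdot,\cdot)$. Let

(5.5.23) $S(\mathbf{U}) := \{u \in \mathbf{U}, \; (u,u) = 1\}, \quad \mathbf{U}$ is a subspace of $\mathbf{V}$,

be the unit sphere in $\mathbf{U}$. ($S(\{0\}) = \emptyset$.) For two subspaces $\mathbf{U}, \mathbf{W} \subset \mathbf{V}$ the distance $\mathrm{dist}(\mathbf{U}, \mathbf{W})$ is defined to be the Hausdorff distance between the unit spheres in $\mathbf{U}, \mathbf{W}$:

(5.5.24) $\mathrm{dist}(\mathbf{U}, \mathbf{W}) := \max\big(\max_{u \in S(\mathbf{U})} \min_{w \in S(\mathbf{W})} \|u - w\|, \; \max_{w \in S(\mathbf{W})} \min_{u \in S(\mathbf{U})} \|w - u\|\big),$

$\mathrm{dist}(\{0\}, \{0\}) = 0, \quad \mathrm{dist}(\{0\}, \mathbf{W}) = \mathrm{dist}(\mathbf{W}, \{0\}) = 1$ if $\dim \mathbf{W} \ge 1$.

(a) Let $\dim \mathbf{U}, \dim \mathbf{W} \ge 1$. Show that $\mathrm{dist}(\mathbf{U}, \mathbf{W}) \le \sqrt{2}$. Equality holds if either $\mathbf{U} \cap \mathbf{W}^\perp$ or $\mathbf{U}^\perp \cap \mathbf{W}$ are nontrivial subspaces. In particular $\mathrm{dist}(\mathbf{U}, \mathbf{W}) = \sqrt{2}$ if $\dim \mathbf{U} \ne \dim \mathbf{W}$.

(b) Show that dist is a metric on $\cup_{m=0}^{n} \mathrm{Gr}_m(\mathbf{V})$.

(c) Show that $\mathrm{Gr}_m(\mathbf{V})$ is a compact connected space with respect to the metric $\mathrm{dist}(\cdot,\cdot)$ for $m = 0, 1, \ldots, n$. (I.e. for each sequence of $m$-dimensional subspaces $\mathbf{U}_i$, $i \in \mathbb{N}$, one can choose a subsequence $i_j$, $j \in \mathbb{N}$, such that $\mathbf{U}_{i_j}$, $j \in \mathbb{N}$, converges in the metric dist to some $\mathbf{U} \in \mathrm{Gr}_m(\mathbf{V})$. Hint: Choose an orthonormal basis in each $\mathbf{U}_i$.)

(d) Show that $\cup_{m=0}^{n} \mathrm{Gr}_m(\mathbf{V})$ is a compact space in the metric dist.

(e) Let $\mathbf{U}, \mathbf{U}_i \in \mathrm{Gr}_m(\mathbf{V})$, $i \in \mathbb{N}$, $1 \le m < n$. Let $P, P_i \in \mathbf{S}(\mathbf{V})$ be the orthogonal projections on $\mathbf{U}, \mathbf{U}_i$ respectively. Show that $\lim_{i \to \infty} \mathrm{dist}(\mathbf{U}_i, \mathbf{U}) = 0$ if and only if $\lim_{i \to \infty} P_i = P$.

(f) Let $\mathbf{U}_i \in \mathrm{Gr}_m(\mathbf{V})$, $\mathbf{W}_i \in \mathrm{Gr}_l(\mathbf{V})$, $1 \le l, m < n$, and $\mathbf{U}_i \perp \mathbf{W}_i$ for $i \in \mathbb{N}$. Assume that $\lim_{i \to \infty} \mathrm{dist}(\mathbf{U}_i, \mathbf{U}) = 0$ and $\lim_{i \to \infty} \mathrm{dist}((\wedge^m \mathbf{U}_i) \wedge \mathbf{W}_i, \mathbf{X}) = 0$ for some subspaces $\mathbf{U} \in \mathrm{Gr}_m(\mathbf{V})$, $\mathbf{X} \in \mathrm{Gr}_l(\wedge^{m+1} \mathbf{V})$. Show that there exists $\mathbf{W} \in \mathrm{Gr}_l(\mathbf{V})$ orthogonal to $\mathbf{U}$ such that $\lim_{i \to \infty} \mathrm{dist}(\mathbf{W}_i, \mathbf{W}) = 0$.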

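13. Let $\mathbf{V}$ be an $n$-dimensional inner product space over $\mathbb{F} = \mathbb{R}, \mathbb{C}$, with the inner product $(\cdot,\cdot)$. Let $m \in [1, n-1] \cap \mathbb{N}$ and assume that $\mathbf{U}, \mathbf{W} \in \mathrm{Gr}_m(\mathbf{V})$. Choose orthonormal bases $\{u_1, \ldots, u_m\}$, $\{w_1, \ldots, w_m\}$ in $\mathbf{U}, \mathbf{W}$ respectively. Show

(a) $\det((u_i, w_j))_{i,j=1}^{m} = \overline{\det((w_j, u_i))_{i,j=1}^{m}}$.

(b) Let $x_1, \ldots, x_m$ be another orthonormal basis in $\mathbf{U}$, i.e. $x_i = \sum_{k=1}^{m} q_{ki} u_k$, $i = 1,\ldots,m$, where $Q = (q_{ki}) \in \mathbb{F}^{m \times m}$ is orthogonal for $\mathbb{F} = \mathbb{R}$ and unitary for $\mathbb{F} = \mathbb{C}$. Then $\det((x_i, w_j))_{i,j=1}^{m} = \det Q \, \det((u_k, w_j))_{k,j=1}^{m}$.

(c) $[\mathbf{U}, \mathbf{W}] := |\det((u_i, w_j))_{i,j=1}^{m}|$ is independent of the choices of orthonormal bases in $\mathbf{U}, \mathbf{W}$. Furthermore $[\mathbf{U}, \mathbf{W}] = [\mathbf{W}, \mathbf{U}]$.

(d) Fix an orthonormal basis $\{w_1, \ldots, w_m\}$ in $\mathbf{W}$. Then there exists an orthonormal basis $\{u_1, \ldots, u_m\}$ in $\mathbf{U}$ such that the matrix $((u_i, w_j))_{i,j=1}^{m}$ is upper triangular. Hint: Let $\mathbf{W}_i = \mathrm{span}(w_{i+1}, \ldots, w_m)$ for $i = 1,\ldots,m-1$. Consider $\mathrm{span}(w_1)^\perp \cap \mathbf{U}$, which has dimension $m-1$ at least. Let $\mathbf{U}_1$ be an $(m-1)$-dimensional subspace of $\mathrm{span}(w_1)^\perp \cap \mathbf{U}$. Let $u_1 \in S(\mathbf{U}) \cap \mathbf{U}_1^\perp$. Use $\mathbf{U}_1, \mathbf{W}_1$ to define an $(m-2)$-dimensional subspace $\mathbf{U}_2 \subset \mathbf{U}_1$ and $u_2 \in S(\mathbf{U}_1) \cap \mathbf{U}_2^\perp$ as above. Continue in this manner to find an orthonormal basis $\{u_1, \ldots, u_m\}$.

(e) $[\mathbf{U}, \mathbf{W}] \le 1$. ($[\mathbf{U}, \mathbf{W}]$ is called the cosine of the angle between $\mathbf{U}, \mathbf{W}$.)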

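(f) $[\mathbf{U}, \mathbf{W}] = 0 \iff \mathbf{U}^\perp \cap \mathbf{W} \ne \{0\} \iff \mathbf{U} \cap \mathbf{W}^\perp \ne \{0\}$. Hint: Use (d) and induction.

14. Let $\mathbf{V}$ be an $n$-dimensional inner product space over $\mathbb{F} = \mathbb{R}, \mathbb{C}$, with the inner product $(\cdot,\cdot)$. Let $l, m \in [1,n] \cap \mathbb{N}$ and assume that $\mathbf{U} \in \mathrm{Gr}_l(\mathbf{V})$, $\mathbf{W} \in \mathrm{Gr}_m(\mathbf{V})$. Let $P, Q \in \mathbf{S}(\mathbf{V})$ be the orthogonal projections on $\mathbf{U}, \mathbf{W}$ respectively. Show

(a) $\mathbf{U} + \mathbf{W} = \mathbf{U} \cap \mathbf{W} \oplus \mathbf{U} \cap (\mathbf{U} \cap \mathbf{W})^\perp \oplus \mathbf{W} \cap (\mathbf{U} \cap \mathbf{W})^\perp$.

(b) $\mathrm{rank}\, PQ = \mathrm{rank}\, QP = \mathrm{rank}\, PQP = \mathrm{rank}\, QPQ$.

(c) $\mathrm{rank}\, PQ = \dim \mathbf{W} - \dim \mathbf{W} \cap \mathbf{U}^\perp$, $\mathrm{rank}\, QP = \dim \mathbf{U} - \dim \mathbf{U} \cap \mathbf{W}^\perp$.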

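15. Let $\mathbf{V}$ be an $n$-dimensional inner product space over $\mathbb{F} = \mathbb{R}, \mathbb{C}$, with the inner product $(\cdot,\cdot)$. Assume that $\mathbf{V} = \oplus_{i=1}^{p} \mathbf{U}_i = \oplus_{j=1}^{q} \mathbf{W}_j$ are two decompositions of $\mathbf{V}$ into nontrivial orthogonal subspaces:

$\dim \mathbf{U}_i = l_i - l_{i-1}, \; i = 1,\ldots,p, \quad \dim \mathbf{W}_j = m_j - m_{j-1}, \; j = 1,\ldots,q,$
$0 = l_0 < l_1 < \ldots < l_p = n, \quad 0 = m_0 < m_1 < \ldots < m_q = n.$

Let $Q_i, R_j \in \mathbf{S}(\mathbf{V})$ be the orthogonal projections on $\mathbf{U}_i, \mathbf{W}_j$ respectively for $i = 1,\ldots,p$, $j = 1,\ldots,q$. Let $n_{ij} := \mathrm{rank}\, Q_iR_j$, $i = 1,\ldots,p$, $j = 1,\ldots,q$.

Denote $\mathcal{K} := \{(i,j) \in \{1,\ldots,p\} \times \{1,\ldots,q\} : Q_iR_j \ne 0\}$. For $i \in \{1,\ldots,p\}$, $j \in \{1,\ldots,q\}$ let

$\mathcal{J}_i := \{j' \in \{1,\ldots,q\}, \; (i,j') \in \mathcal{K}\}, \quad \mathcal{I}_j := \{i' \in \{1,\ldots,p\}, \; (i',j) \in \mathcal{K}\}.$

Show

(a) $Q_i = \sum_{j \in \mathcal{J}_i} Q_iR_j$, $i = 1,\ldots,p$.

(b) Let $(i_1,j_1), \ldots, (i_s,j_s) \in \mathcal{K}$ and assume that $i_a \ne i_b$, $j_a \ne j_b$ for $a \ne b$. Then

$\mathrm{rank} \sum_{a=1}^{s} Q_{i_a}R_{j_a} = \mathrm{rank}\, \big(\sum_{a=1}^{s} Q_{i_a}R_{j_a}\big)\big(\sum_{a=1}^{s} Q_{i_a}R_{j_a}\big)^* = \mathrm{rank} \sum_{a=1}^{s} Q_{i_a}R_{j_a}Q_{i_a} = \sum_{a=1}^{s} \mathrm{rank}\, Q_{i_a}R_{j_a}Q_{i_a} = \sum_{a=1}^{s} \mathrm{rank}\, Q_{i_a}R_{j_a}.$

(c) $\mathrm{rank}\, Q_i \le \sum_{j \in \mathcal{J}_i} n_{ij}$, where strict inequality may hold.

(d) $\mathbf{U}_i = \sum_{j \in \mathcal{J}_i} \mathbf{U}_{ij}$, where $\mathbf{U}_{ij} := Q_i\mathbf{W}_j$, $\dim \mathbf{U}_{ij} = n_{ij}$ for $i = 1,\ldots,p$, $j = 1,\ldots,q$.

(e) $R_j = \sum_{i \in \mathcal{I}_j} R_jQ_i$, $j = 1,\ldots,q$.

(f) $\mathrm{rank}\, R_j \le \sum_{i \in \mathcal{I}_j} n_{ij}$, where strict inequality may hold.

(g) $\mathbf{W}_j = \sum_{i \in \mathcal{I}_j} \mathbf{W}_{ji}$, where $\mathbf{W}_{ji} := R_j\mathbf{U}_i$, $\dim \mathbf{W}_{ji} = n_{ij}$ for $j = 1,\ldots,q$, $i = 1,\ldots,p$.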



Chapter 6 

Nonnegative matrices 



6.1 Graphs 



6.1.1 Undirected graphs 

An undirected graph is denoted by $G = (V,E)$. It consists of vertices $v \in V$ and a set of unordered pairs $(u,v)$, where $u, v \in V$, which are called the edges of $G$. The set of edges in $G$ is denoted by $E$. Let $n = \#V$ be the cardinality of $V$, i.e. $V$ has $n$ vertices. Then it is useful to identify $V$ with $\langle n \rangle = \{1,\ldots,n\}$. For example, the graph $G = (\langle 4 \rangle, \{(1,2), (1,4), (2,3), (2,4), (3,4)\})$ has 4 vertices and 5 edges.

In what follows we assume that $G = (V,E)$ unless stated otherwise. A graph $H = (W,F)$ is called a subgraph of $G = (V,E)$ if $W$ is a subset of $V$ and any edge in $F$ is an edge in $E$. Given a subset $W$ of $V$, $E(W) = \{(u,v) \in E, \; u, v \in W\}$ is the set of edges in $G$ induced by $W$. The graph $G(W) := (W, E(W))$ is called the subgraph induced by $W$. Given a subset $F$ of $E$, $V(F)$ is the set of all vertices participating in the edges in $F$. The graph $G(F) = (V(F), F)$ is called the subgraph induced by $F$.

The degree of $v$, denoted by $\deg v$, is the number of edges that have $v$ as a vertex. Since each edge has two different vertices,

(6.1.1) $\sum_{v \in V} \deg v = 2\,\#E,$

where $\#E$ is the number of edges in $E$. $v \in V$ is called an isolated vertex if $\deg v = 0$. Note that $V(E)$ is the set of nonisolated vertices in $G$, and $G(E) = (V(E), E)$ is the subgraph of $G$ obtained by deleting the isolated vertices in $G$.

The complete graph on $n$ vertices is the graph with all possible edges. It is denoted by $K_n = (\langle n \rangle, \mathcal{E}_n)$, where $\mathcal{E}_n = \{(1,2), \ldots, (1,n), (2,3), \ldots, (n-1,n)\}$. For example, $K_3$ is called a triangle. Note that any graph on $n$ vertices $G = (\langle n \rangle, E)$ is a subgraph of $K_n$, obtained by erasing some of the edges in $K_n$, but not the vertices. I.e. $E \subset \mathcal{E}_n$.
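
The degree-sum identity (6.1.1) is easy to confirm on the example graph from this subsection and on a complete graph. The following is a minimal sketch, assuming a simple undirected graph on the vertex set $\{1,\ldots,n\}$ given as a list of pairs; the function names are illustrative only.

```python
from itertools import combinations

def degrees(n, edges):
    """Degree of each vertex 1..n in a simple undirected graph."""
    deg = {v: 0 for v in range(1, n + 1)}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def complete_graph_edges(n):
    """Edge set of K_n: all pairs (i, j) with 1 <= i < j <= n."""
    return list(combinations(range(1, n + 1), 2))

# The example graph on 4 vertices with 5 edges from the text.
example_edges = [(1, 2), (1, 4), (2, 3), (2, 4), (3, 4)]
example_deg = degrees(4, example_edges)

# K_5 has n(n-1)/2 = 10 edges and every vertex has degree 4.
k5_edges = complete_graph_edges(5)
k5_deg = degrees(5, k5_edges)

print(example_deg, sum(example_deg.values()), 2 * len(example_edges))
```

Since the example edge set is a subset of the edge set of $K_4$, the same helper also illustrates the remark that every graph on $n$ vertices is a subgraph of $K_n$.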

$G = (V,E)$ is called bipartite if $V$ is a union of two disjoint sets of vertices $V_1 \cup V_2$ so that each edge in $E$ connects some vertex in $V_1$ to some vertex in $V_2$. Thus $E \subset V_1 \times V_2 := \{(v,w), \; v \in V_1, w \in V_2\}$. So any bipartite graph $G = (V_1 \cup V_2, E)$ is a subgraph of the complete bipartite graph $K_{V_1,V_2} := (V_1 \cup V_2, V_1 \times V_2)$. For positive integers $l, m$ the complete bipartite graph on $l, m$ vertices is denoted by $K_{l,m} := (\langle l \rangle \cup \langle m \rangle, \langle l \rangle \times \langle m \rangle)$. Note that $K_{l,m}$ has $l + m$ vertices and $lm$ edges.

6.1.2 Directed graphs 

A directed graph is denoted by $D = (V,E)$. $V$ is the set of vertices and $E$ is the set of directed edges in $D$. So $E$ is a subset of $V \times V = \{(v,w), \; v, w \in V\}$. Thus $(v,w) \in E$ is a directed edge from $v$ to $w$. For example, the graph $D = (\langle 4 \rangle, \{(1,2), (2,1), (2,3), (2,4), (3,3), (3,4), (4,1)\})$ has 4 vertices and 7 (directed) edges.

The directed edge $(v,v) \in E$ is called a loop, or selfloop. Let

$\deg_{in} v := \#\{(w,v) \in E\}, \quad \deg_{out} v := \#\{(v,w) \in E\},$

be the number of edges into $v$ and out of $v$ in $D$. $\deg_{in}, \deg_{out}$ are called the in and out degrees. Clearly we have the analog of (6.1.1):

(6.1.2) $\sum_{v \in V} \deg_{in} v = \sum_{v \in V} \deg_{out} v = \#E.$

A subgraph $H = (W,F)$ of $D = (V,E)$ and the induced subgraphs $D(W) = (W, E(W))$, $D(F) = (V(F), F)$ are defined as in §6.1.1. $v \in V$ is called isolated if $\deg_{in}(v) = \deg_{out}(v) = 0$.
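
The identity (6.1.2) can be checked on the example digraph above. This is a minimal sketch, assuming a digraph on $\{1,\ldots,n\}$ given as a list of ordered pairs; `in_out_degrees` is an illustrative name, not notation from the text.

```python
def in_out_degrees(n, diedges):
    """Return (deg_in, deg_out) dictionaries for a digraph on vertices 1..n."""
    deg_in = {v: 0 for v in range(1, n + 1)}
    deg_out = {v: 0 for v in range(1, n + 1)}
    for v, w in diedges:
        deg_out[v] += 1  # the diedge (v, w) leaves v
        deg_in[w] += 1   # and enters w; a loop (v, v) counts once in each
    return deg_in, deg_out

# The example digraph on 4 vertices with 7 diedges from the text.
D_edges = [(1, 2), (2, 1), (2, 3), (2, 4), (3, 3), (3, 4), (4, 1)]
deg_in, deg_out = in_out_degrees(4, D_edges)

print(deg_in, deg_out)
```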

6.1.3 Multigraphs

A multigraph is a graph where multiple edges, and in particular multiple loops, are allowed. So an undirected multigraph $G = (V,E)$ has undirected edges, which may be multiple, and may have multiple loops. A directed multigraph $D = (V,E)$ may have multiple edges.

Each directed multigraph $D = (V,E)$ induces an undirected multigraph $G(D) = (V,E')$, where each directed edge $(u,v) \in E$ is viewed as an undirected edge $(u,v) \in E'$. (Each loop $(u,u) \in E$ will appear twice in $E'$.) Vice versa, an undirected multigraph $G = (V,E')$ induces a directed multigraph $D(G) = (V,E)$, where each undirected edge $(u,v)$ is replaced by $(u,v)$ and $(v,u)$, when $u \ne v$. The loop $(u,u)$ appears $p$ times in $D(G)$ if it appears $p$ times in $G$.

Most of the following notions are the same for directed or undirected graphs or multigraphs, unless stated otherwise. We state them for directed multigraphs $D = (V,E)$.

Definition 6.1.1 

1. A walk in $D = (V,E)$ is given by $v_0v_1 \ldots v_p$, where $(v_{i-1}, v_i) \in E$ for $i = 1,\ldots,p$. One views it as a walk that starts at $v_0$ and ends at $v_p$. The length of the walk, $p$, is the number of edges in the walk.

2. A path is a walk where $v_i \ne v_j$ for $i \ne j$.

3. A closed walk is a walk where $v_p = v_0$.

4. A cycle is a closed walk where $v_i \ne v_j$ for $0 \le i < j < p$. A loop $(v,v) \in E$ is considered a cycle of length 1. Note that a closed walk $vwv$, where $v \ne w$, is considered a cycle of length 2 in a digraph, but not a cycle in an undirected multigraph!

5. $D$ is called a diforest if $D$ does not have cycles. (An undirected multigraph with no cycles is called a forest.)

6. Let $D = (V,E)$ be a diforest. Then the height of $v \in V$, denoted by height$(v)$, is the length of the longest path ending at $v$.

7. Two vertices $v, w \in V$, $v \ne w$, are called strongly connected if there exist two walks in $D$, the first starting at $v$ and ending at $w$, and the second starting at $w$ and ending at $v$. For undirected multigraphs $G = (V,E)$ the corresponding notion is that $u, v$ are connected.

8. A multidigraph $D = (\langle n \rangle, E)$ is called strongly connected if either $n = 1$ and $(1,1) \in E$, or $n > 1$ and any two vertices in $D$ are strongly connected.

9. A multigraph $G = (V,E)$ is called connected if either $n = 1$, or $n > 1$ and any two vertices in $G$ are connected. (Note that a simple graph on one vertex $G = (\langle 1 \rangle, \emptyset)$ is considered connected. The induced directed graph $D(G) = G$ is not strongly connected.)

10. Assume that a multidigraph $D = (V,E)$ is strongly connected. Then $D$ is called primitive if there exists $k \ge 1$ such that for any two vertices $u, v \in V$ there exists a walk of length $k$ which connects $u$ and $v$. For a primitive multidigraph $D$, the minimal such $k$ is called the index of primitivity, and is denoted by indprim$(D)$. A strongly connected multidigraph which is not primitive is called imprimitive.

11. For $W \subset V$, the multidirected subgraph $D(W) = (W, E(W))$ is called a strongly connected component of $D$ if $D(W)$ is strongly connected, and for any $W \subsetneq U \subset V$ the induced subgraph $D(U) = (U, E(U))$ is not strongly connected.

12. For $W \subset V$, the undirected subgraph $G(W) = (W, E(W))$ of an undirected multigraph $G = (V,E)$ is called a connected component of $G$ if $G(W)$ is connected, and for any $W \subsetneq U \subset V$ the induced subgraph $G(U) = (U, E(U))$ is not connected.

13. An undirected forest $G = (V,E)$ is called a tree if it is connected.

14. A diforest $D = (V,E)$ is called a ditree if the induced undirected multigraph $G(D)$ is a tree.

15. Let $D = (V,E)$ be a multidigraph. The reduced (simple) digraph $D_r = (V_r, E_r)$ is defined as follows. Let $D(V_i)$, $i = 1,\ldots,k$, be all strongly connected components of $D$. Let $V_0 = V \setminus (\cup_{i=1}^{k} V_i)$ be all vertices in $D$ which do not belong to any of the strongly connected components of $D$. (It is possible that either $V_0$ is an empty set or $k = 0$, i.e. $D$ does not have strongly connected components, and the two conditions are mutually exclusive.) Then $V_r = (\cup_{v \in V_0} \{v\}) \cup (\cup_{i=1}^{k} \{V_i\})$, i.e. $V_r$ is the set of all vertices in $V$ which do not belong to any strongly connected component, together with the $k$ new vertices named $\{V_1\}, \ldots, \{V_k\}$. A vertex $u' \in V_r$ is viewed as either a set consisting of one vertex $v \in V_0$ or the set $V_i$ for some $i = 1,\ldots,k$. Then $E_r$ does not contain loops. Furthermore $(s,t) \in E_r$ if there exists an edge $(u,v) \in E$, where $u$ and $v$ are in the sets of vertices of $V$ represented by $s$ and $t$ respectively.

16. Two multidigraphs $D_1 = (V_1, E_1)$, $D_2 = (V_2, E_2)$ are called isomorphic if there exists a bijection $\phi : V_1 \to V_2$ which induces a bijection $\tilde\phi : E_1 \to E_2$. That is, if $(u_1, v_1) \in E_1$ is a diedge of multiplicity $k$ in $E_1$ then $(\phi(u_1), \phi(v_1)) \in E_2$ is a diedge of multiplicity $k$, and vice versa.
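
Items 7, 8, 11 and 15 of Definition 6.1.1 can be illustrated computationally. The sketch below is one possible reading, assuming that two distinct vertices lie in the same strongly connected component exactly when each is reachable from the other, and that a single vertex forms a component only if it carries a loop (item 8). The example digraph and the function names are hypothetical and not taken from the text.

```python
def reachable(adj, start):
    """Vertices reachable from start by a walk of length >= 0."""
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen

def reduced_digraph(vertices, diedges):
    """Return (strongly connected components, V_r, E_r) as sets of frozensets."""
    adj = {v: set() for v in vertices}
    for u, w in diedges:
        adj[u].add(w)
    reach = {v: reachable(adj, v) for v in vertices}
    # block[v] = all vertices mutually reachable with v (always contains v).
    block = {v: frozenset(w for w in reach[v] if v in reach[w]) for v in vertices}
    # A singleton block is a strongly connected component only if it has a loop.
    components = {b for v, b in block.items() if len(b) > 1 or v in adj[v]}
    reduced_vertices = set(block.values())
    # Keep one edge between distinct blocks, so E_r is simple and has no loops.
    reduced_edges = {(block[u], block[w]) for u, w in diedges if block[u] != block[w]}
    return components, reduced_vertices, reduced_edges

# Hypothetical example: a 2-cycle {1, 2}, a loop at 4, and the vertices 3 and 5,
# which lie on no cycle and therefore belong to V_0.
vertices = [1, 2, 3, 4, 5]
diedges = [(1, 2), (2, 1), (2, 3), (3, 4), (4, 4), (4, 5)]
components, reduced_vertices, reduced_edges = reduced_digraph(vertices, diedges)
```

In this example the reduced digraph is the path $\{1,2\} \to \{3\} \to \{4\} \to \{5\}$, which has no cycles, in agreement with the claim that $D_r$ is a diforest.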



Proposition 6.1.2 Let $G = (V,E)$ be a multigraph. Then $G$ is a disjoint union of its connected components. That is, there is a unique decomposition of $V$ to $\cup_{i=1}^{k} V_i$, up to relabeling of $V_1, \ldots, V_k$, such that the following conditions hold:

1. $V_1, \ldots, V_k$ are nonempty and mutually disjoint.

2. Each $G(V_i) = (V_i, E(V_i))$ is a connected component of $G$.

3. $E = \cup_{i=1}^{k} E(V_i)$.

Proof. We introduce the following relation $\sim$ on $V$. First, we assume that $v \sim v$ for each $v \in V$. Second, for $v, w \in V$, $v \ne w$, we say that $v \sim w$ if $v$ is connected to $w$. It is straightforward to show that $\sim$ is an equivalence relation. Let $V_1, \ldots, V_k$ be the equivalence classes in $V$. That is, distinct $v, w$ lie in the same $V_i$ if and only if $v$ and $w$ are connected. The rest of the proposition follows in a straightforward way. □

Proposition 6.1.3 Let $D = (V,E)$ be a multidigraph. Then the reduced digraph $D_r$ is a diforest.

See Problem 6.1.5.4 for proof.

Proposition 6.1.4 Let $D = (V,E)$ be a multidigraph. Then $D$ is a diforest if and only if it is isomorphic to a digraph $D_1 = (\langle n \rangle, E_1)$ such that if $(i,j) \in E_1$ then $i < j$.

Proof. Clearly, the graph $D_1$ cannot have a cycle. So if $D$ is isomorphic to $D_1$ then $D$ is a diforest. Assume now that $D = (V,E)$ is a diforest. Let $V_i$ be the set of all vertices in $V$ having height $i$ for $i = 0,\ldots,k$, where $k \ge 0$ is the maximal height of all vertices in $D$. Observe that from the definition of height it follows that if $(u,v) \in E$, where $u \in V_i$, $v \in V_j$, then $i < j$. Rename the vertices of $V$ such that $V_j = \{n_j + 1, \ldots, n_{j+1}\}$, where $0 = n_0 < n_1 < \ldots < n_{k+1} = n := \#V$. Then one obtains the isomorphic graph $D_1 = (\langle n \rangle, E_1)$, such that if $(i,j) \in E_1$ then $i < j$. □
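
The relabeling used in the proof can be carried out mechanically. The sketch below assumes that the input digraph has no cycles (otherwise the recursion would not terminate); the vertex names and the functions `heights` and `relabel_by_height` are illustrative only.

```python
from functools import lru_cache

def heights(vertices, diedges):
    """height(v) = length of the longest path ending at v (input must be acyclic)."""
    preds = {v: [] for v in vertices}
    for u, w in diedges:
        preds[w].append(u)

    @lru_cache(maxsize=None)
    def h(v):
        # A vertex with no incoming diedge has height 0.
        return max((h(u) + 1 for u in preds[v]), default=0)

    return {v: h(v) for v in vertices}

def relabel_by_height(vertices, diedges):
    """Number the vertices 1..n by increasing height, as in the proof above."""
    ht = heights(vertices, diedges)
    order = sorted(vertices, key=lambda v: ht[v])  # stable sort: ties keep input order
    new_label = {v: i + 1 for i, v in enumerate(order)}
    return new_label, [(new_label[u], new_label[w]) for u, w in diedges]

# Hypothetical diforest on four named vertices.
vertices = ["a", "b", "c", "d"]
diedges = [("c", "a"), ("a", "b"), ("c", "b"), ("d", "b")]
ht = heights(vertices, diedges)
new_label, relabeled = relabel_by_height(vertices, diedges)
```

Every relabeled diedge $(i,j)$ satisfies $i < j$, because a diedge always goes from a vertex of smaller height to a vertex of strictly larger height.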

Theorem 6.1.5 Let $D = (V,E)$ be a strongly connected multidigraph. Let $\ell$ be the g.c.d. (the greatest common divisor) of the lengths of all cycles in $D$. Then exactly one of the following conditions holds.

1. $\ell = 1$. Then $D$ is primitive. Let $s$ be the length of the shortest cycle in $D$. Then indprim$(D) \le \#V + s(\#V - 2)$.

2. $\ell > 1$. Then $D$ is imprimitive. Furthermore, it is possible to divide $V$ into $\ell$ disjoint nonempty subsets $V_1, \ldots, V_\ell$ such that $E \subset \cup_{i=1}^{\ell} V_i \times V_{i+1}$, where $V_{\ell+1} := V_1$.

Define $D_i = (V_i, E_i)$ to be the following digraph: $(v,w) \in E_i$ if there is a path or cycle of length $\ell$ from $v$ to $w$ in $D$, for $i = 1,\ldots,\ell$. Then each $D_i$ is strongly connected and irreducible.

The proof of this theorem will be given later using the Perron-Frobenius theorem. (Obviously, one can give a pure graph theoretical proof of this theorem.) If $D$ is a strongly connected imprimitive multidigraph, then $\ell > 1$ given in (2) is called the index of imprimitivity of D.
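
The quantities in Theorem 6.1.5 can be computed for small digraphs. The sketch below does not follow the proof announced in the text. Instead it assumes the standard fact that, for a strongly connected digraph with breadth-first distances $d(\cdot)$ from any fixed root, the g.c.d. of all cycle lengths equals the g.c.d. of the numbers $d(u) + 1 - d(w)$ over all diedges $(u,w)$. The function names and the example digraphs are illustrative.

```python
from collections import deque
from math import gcd

def period(vertices, diedges):
    """Return (g.c.d. of cycle lengths, BFS distances); input must be strongly connected."""
    adj = {v: [] for v in vertices}
    for u, w in diedges:
        adj[u].append(w)
    root = vertices[0]
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    g = 0
    for u, w in diedges:
        g = gcd(g, dist[u] + 1 - dist[w])
    return g, dist

def index_of_primitivity(vertices, diedges):
    """Smallest k such that every ordered pair is joined by a walk of length k, or None."""
    n = len(vertices)
    idx = {v: i for i, v in enumerate(vertices)}
    A = [[False] * n for _ in range(n)]
    for u, w in diedges:
        A[idx[u]][idx[w]] = True
    P = [row[:] for row in A]  # P[i][j]: is there a walk of length k from i to j?
    for k in range(1, n * n + 1):
        if all(all(row) for row in P):
            return k
        P = [[any(P[i][m] and A[m][j] for m in range(n)) for j in range(n)]
             for i in range(n)]
    return None

cycle3 = [(1, 2), (2, 3), (3, 1)]        # a directed 3-cycle: imprimitive
cycle3_with_loop = cycle3 + [(1, 1)]     # adding a loop makes it primitive

ell, dist = period([1, 2, 3], cycle3)
ell_loop, _ = period([1, 2, 3], cycle3_with_loop)
k_loop = index_of_primitivity([1, 2, 3], cycle3_with_loop)
```

For the 3-cycle, the classes of vertices with equal $d(v) \bmod \ell$ give a partition $V_1, V_2, V_3$ as in part 2 of the theorem. For the 3-cycle with a loop, $s = 1$ and the computed index equals the bound $\#V + s(\#V - 2) = 4$.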

6.1.4 Matrices and graphs 

For a set $S$ denote by $S^{m \times n}$ the set of all $m \times n$ matrices $A = [a_{ij}]_{i,j=1}^{m,n}$ where each entry $a_{ij}$ is in $S$. Then $A^\top \in S^{n \times m}$ is the transposed matrix of $A$. Denote by $\mathbb{F}$ any field, and by $\mathbb{R}, \mathbb{C}$ the fields of real and complex numbers respectively. By $\mathrm{S}_n(S) \subset S^{n \times n}$ denote the set of all symmetric matrices $A = [a_{ij}]$, $a_{ij} = a_{ji}$, with entries in $S$. Assume that $0 \in S$. Then denote by $\mathrm{S}_{n,0}(S) \subset \mathrm{S}_n(S)$ the subset of all symmetric matrices with entries in $S$ and zero diagonal. Denote by $\mathcal{P}_n \subset \{0,1\}^{n \times n}$ the group of permutation matrices. I.e. each $P \in \mathcal{P}_n$ has one 1 in each row and column, and all other $n^2 - n$ entries are zero. Denote by $\mathbf{1} = (1,\ldots,1)^\top \in \mathbb{R}^n$ the vector of length $n$ all of whose coordinates are 1. For $A = [a_{ij}] \in \mathbb{F}^{n \times n}$ we denote by $\mathrm{tr}\, A := \sum_{i=1}^{n} a_{ii}$ the trace of $A$. For any $t \in \mathbb{R}$, we let sign $t = 0$ if $t = 0$ and sign $t = \frac{t}{|t|}$ if $t \ne 0$. For $A, B \in \mathbb{R}^{m \times n}$ we denote $B - A \ge 0$, $B - A \gneq 0$, $B - A > 0$ if $B - A$ is a nonnegative matrix, a nonnegative nonzero matrix, and a positive matrix respectively.

Let $D = (V,E)$ be a digraph. Assume that $\#V = n$ and label the vertices of $V$ as $1,\ldots,n$. So we have a bijection $\phi_1 : V \to \langle n \rangle$. This bijection induces an isomorphic graph $D_1 = (\langle n \rangle, E_1)$. With $D_1$ we associate the following matrix $A(D_1) = [a_{ij}]_{i,j=1}^{n} \in \mathbb{Z}_+^{n \times n}$. So $a_{ij}$ is the number of directed edges from the vertex $i$ to the vertex $j$. (If $a_{ij} = 0$ then there are no diedges from $i$ to $j$.) When no confusion arises we let $A(D) := A(D_1)$, and we call $A(D)$ a representation matrix of $D$. Note that a different bijection $\phi_2 : V \to \langle n \rangle$ gives rise to a different $A(D_2)$, where $A(D_2) = P^\top A(D_1) P$ for some permutation matrix $P \in \mathcal{P}_n$. See Problem 7.

If $D$ is a simple digraph then $A(D) \in \{0,1\}^{n \times n}$. If $G$ is a multigraph, then $A(G) = A(D)$ where $D$ is the digraph induced by $G$. Hence $A(G) \in \mathrm{S}_n(\mathbb{Z}_+)$. If $G$ is a graph then $A(G) \in \mathrm{S}_{n,0}(\{0,1\})$.

Proposition 6.1.6 Let $D = (V, E)$ be a multidigraph on $n$ vertices. Let
$A(D)$ be a representation matrix of $D$. For an integer $k \ge 1$ let $A(D)^k =
[a_{ij}^{(k)}] \in \mathbb{Z}_+^{n \times n}$. Then $a_{ij}^{(k)}$ is the number of walks of length $k$ from the vertex
$i$ to the vertex $j$. In particular, $\mathbf{1}^T A(D)^k \mathbf{1}$ and $\operatorname{tr} A(D)^k$ are the total number of
walks and the total number of closed walks of length $k$ in $D$.

Proof. For $k = 1$ the proposition is obvious. Assume that $k > 1$.
Recall that

(6.1.3) $a_{ij}^{(k)} = \sum_{i_1, \ldots, i_{k-1} \in \langle n \rangle} a_{i i_1} a_{i_1 i_2} \cdots a_{i_{k-1} j}$.

The summand $a_{i i_1} a_{i_1 i_2} \cdots a_{i_{k-1} j}$ gives the number of walks of the form
$i_0 i_1 i_2 \ldots i_{k-1} i_k$, where $i_0 = i$, $i_k = j$. Indeed, if one of the terms in this
product is zero, i.e. there is no diedge $(i_p, i_{p+1})$, then the product is zero.
Otherwise each positive integer $a_{i_p i_{p+1}}$ counts the number of diedges $(i_p, i_{p+1})$.
Hence $a_{i i_1} a_{i_1 i_2} \cdots a_{i_{k-1} j}$ is the number of walks of the form $i_0 i_1 i_2 \ldots i_{k-1} i_k$.
The total number of walks from $i = i_0$ to $j = i_k$ of length $k$ is the sum
given by (6.1.3). The total number of walks in $D$ of length $k$ is
$\sum_{i,j=1}^n a_{ij}^{(k)} = \mathbf{1}^T A(D)^k \mathbf{1}$. The total number of closed walks in $D$ of length $k$ is
$\sum_{i=1}^n a_{ii}^{(k)} = \operatorname{tr} A(D)^k$. □
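
As an illustration of Proposition 6.1.6, the following sketch compares the entries of $A(D)^k$ with a direct evaluation of the sum (6.1.3). The small multidigraph, the 0-based vertex labels and the helper names (mat_mul, mat_pow, walks_by_enumeration) are assumptions made for this example only.

```python
from itertools import product


def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_pow(a, k):
    """k-th power of a square matrix, k >= 0."""
    n = len(a)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = mat_mul(result, a)
    return result


def walks_by_enumeration(a, k):
    """Evaluate the sum (6.1.3) directly over all intermediate vertices."""
    n = len(a)
    counts = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for middle in product(range(n), repeat=k - 1):
                seq = (i,) + middle + (j,)
                term = 1
                for p in range(k):
                    term *= a[seq[p]][seq[p + 1]]
                counts[i][j] += term
    return counts


# Hypothetical multidigraph on vertices 0, 1, 2 (0-based labels):
# two parallel diedges 0 -> 1, one diedge 1 -> 2, one diedge 2 -> 0, a loop at 2.
A = [[0, 2, 0],
     [0, 0, 1],
     [1, 0, 1]]

A3 = mat_pow(A, 3)
total_walks = sum(sum(row) for row in A3)        # 1^T A^3 1
closed_walks = sum(A3[i][i] for i in range(3))   # tr A^3
```

The direct enumeration takes on the order of $n^{k+1}$ steps and is included only as a cross-check; the matrix power is the practical way to count walks.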

With a multibipartite graph $G = (V_1 \cup V_2, E)$, where $\#V_1 = m$, $\#V_2 = n$,
we associate a representation matrix $B(G) = [b_{ij}]_{i,j=1}^{m,n}$ as follows. Let
$\psi_1 : V_1 \to \langle m \rangle$, $\phi_1 : V_2 \to \langle n \rangle$ be bijections. These bijections induce an
isomorphic graph $D_1 = (\langle m \rangle \cup \langle n \rangle, E_1)$. Then $b_{ij}$ is the number of edges
connecting $i \in \langle m \rangle$ to $j \in \langle n \rangle$ in $D_1$.

A nonnegative matrix $A = [a_{ij}]_{i,j=1}^n \in \mathbb{R}_+^{n \times n}$ induces the following
digraph $D(A) = (\langle n \rangle, E)$. The diedge $(i,j)$ is in $E$ if and only if $a_{ij} > 0$.
Note that $A(D(A)) = [\operatorname{sign} a_{ij}] \in \{0,1\}^{n \times n}$. We have the following
definitions.

Definition 6.1.7

1. $A = [a_{ij}] \in \mathbb{R}^{n \times n}$ is combinatorially symmetric if $\operatorname{sign} a_{ij} = \operatorname{sign} a_{ji}$
for $i, j = 1, \ldots, n$.

2. $A \in \mathbb{R}_+^{n \times n}$ is irreducible if $D(A)$ is strongly connected.

3. $A \in \mathbb{R}_+^{n \times n}$ is primitive if $A^k$ is a positive matrix for some integer
$k \ge 1$.

4. Assume that $A \in \mathbb{R}_+^{n \times n}$ is primitive. Then the smallest positive integer
$k$ such that $A^k > 0$ is called the index of primitivity of $A$, and is
denoted by $\mathrm{indprim}(A)$.

5. $A \in \mathbb{R}_+^{n \times n}$ is imprimitive if $A$ is irreducible but not primitive.

Proposition 6.1.8 Let $D = (\langle n \rangle, E)$ be a multidigraph. Then $D$ is
strongly connected if and only if $(I + A(D))^{n-1} > 0$. In particular, a
nonnegative matrix $A \in \mathbb{R}_+^{n \times n}$ is irreducible if and only if $(I + A)^{n-1} > 0$.

Proof. Apply the Newton binomial theorem for $(1 + t)^{n-1}$ to the matrix
$(I + A(D))^{n-1}$:

$$(I + A(D))^{n-1} = \sum_{p=0}^{n-1} \binom{n-1}{p} A(D)^p.$$

Recall that all the binomial coefficients $\binom{n-1}{p}$ are positive for $p = 0, \ldots, n-1$.
Assume first that $(I + A(D))^{n-1} > 0$. That is, for any $i, j \in \langle n \rangle$ the
$(i,j)$ entry of $(I + A(D))^{n-1}$ is positive. Hence the $(i,j)$ entry of $A(D)^p$ is
positive for some $p = p(i,j)$. Let $i \ne j$. Since $A(D)^0 = I$, we deduce that
$p(i,j) > 0$. Use Proposition 6.1.6 to deduce that there is a walk of length
$p$ from the vertex $i$ to the vertex $j$. Hence $D$ is strongly connected.

Suppose that $D$ is strongly connected. Then for each $i \ne j$ we must
have a path of length $p \in [1, n-1]$ which connects $i$ and $j$, see Problem
1. Hence all off-diagonal entries of $(I + A(D))^{n-1}$ are positive. Clearly,
$(I + A(D))^{n-1} \ge I$. Hence $(I + A(D))^{n-1} > 0$.

Let $A \in \mathbb{R}_+^{n \times n}$. Then the $(i,j)$ entry of $(I + A)^{n-1}$ is positive if and only
if the $(i,j)$ entry of $(I + A(D(A)))^{n-1}$ is positive. Hence $A$ is irreducible if
and only if $(I + A)^{n-1} > 0$. □
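
The criterion of Proposition 6.1.8 can be sketched as follows. Since positivity of $(I + A)^{n-1}$ depends only on the zero pattern of $A$, the sketch works with Boolean matrices; it also cross-checks the answer against a direct reachability search in $D(A)$. The function names and the two test matrices are assumptions of this example.

```python
def bool_mat_mul(a, b):
    """Boolean product: entry (i, j) is True iff some a[i][t] and b[t][j] hold."""
    n = len(a)
    return [[any(a[i][t] and b[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]


def is_irreducible(a):
    """Test (I + A)^(n-1) > 0 on the zero/nonzero pattern of a nonnegative A."""
    n = len(a)
    step = [[i == j or a[i][j] > 0 for j in range(n)] for i in range(n)]
    power = [[i == j for j in range(n)] for i in range(n)]   # (I + A)^0
    for _ in range(n - 1):
        power = bool_mat_mul(power, step)
    return all(all(row) for row in power)


def is_strongly_connected(a):
    """Direct check: every vertex reaches every other vertex in D(A)."""
    n = len(a)
    for start in range(n):
        seen = {start}
        stack = [start]
        while stack:
            u = stack.pop()
            for v in range(n):
                if a[u][v] > 0 and v not in seen:
                    seen.add(v)
                    stack.append(v)
        if len(seen) < n:
            return False
    return True


cycle = [[0, 1, 0],
         [0, 0, 1],
         [1, 0, 0]]          # the directed 3-cycle: irreducible
triangular = [[1, 1],
              [0, 1]]        # no diedge from vertex 1 back to vertex 0: reducible
```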



6.1.5 Problems 

1. Assume $v_1 \ldots v_p$ is a walk in $D = (V,E)$. Show that it is possible
to subdivide this walk to walks $v_{n_{i-1}+1} \ldots v_{n_i}$, $i = 1, \ldots, q$, where
$n_0 = 0 < n_1 < \ldots < n_q = p$, and each walk is either a cycle, or a
maximal path.

Erase all cycles in $v_1 \ldots v_p$ and apply the above statement to the new
walk. Conclude that a walk can be "decomposed" to a union of cycles
and at most one path.

Let $D$ be a digraph. Assume that there exists a walk from $v$ to $w$.
Show that

• if $v \ne w$ then there exists a path from $v$ to $w$ of length $\#V - 1$
at most;

• if $v = w$ there exists a cycle which contains $v$, of length
$\#V$ at most.

2. Let $G = (V, E)$ be a multigraph. Show that the following are equivalent.

• $G$ is bipartite;

• all cycles in $G$ have even length;

• $G$ is imprimitive.

3. Let $D = (V,E)$ be a directed multigraph. Assume that the reduced
graph $D_r$ of $D$ has two vertices. List all possible $D_r$ up to
isomorphism, and describe the structure of all possible corresponding
$D$.

4. Prove Proposition 6.1.3. 

5. Let $A(D) \in \mathbb{Z}_+^{n \times n}$ be the representation matrix of the multidigraph
$D = (\langle n \rangle, E)$. Show that $A(D) + A(D)^T$ is the representation matrix
of the undirected multigraph $G(D) = (\langle n \rangle, E')$ induced by $D$.

6. Let $G = (\langle n \rangle, E')$ be an undirected multigraph, with the representation
matrix $A(G) \in S_n(\mathbb{Z}_+)$. Show that $A(G)$ is the representation
matrix of the induced directed multigraph $D(G)$. In particular, if $G$
is a (simple) graph, then $D(G)$ is a (simple) digraph with no loops.

7. Let $D = (V,E)$, $D_1 = (V_1,E_1)$ be two multidigraphs with the same
number of vertices. Show that $D$ and $D_1$ are isomorphic if and only
if $A(D_1) = P^T A(D) P$ for some permutation matrix $P$.

8. Let $G = (V_1 \cup V_2, E)$ be a bipartite multigraph. Assume that $\#V_1 =
m$, $\#V_2 = n$ and $B(G) \in \mathbb{Z}_+^{m \times n}$ is a representation matrix of $G$.
Show that a full representation matrix of $G$ is of the form

$$A(G) = \begin{bmatrix} 0_{m \times m} & B(G) \\ B(G)^T & 0_{n \times n} \end{bmatrix}.$$

6.2 Perron-Frobenius theorem 

The aim of this section is to prove the Perron-Frobenius theorem.

Theorem 6.2.1 Let $A \in \mathbb{R}_+^{n \times n}$ be an irreducible matrix. Then

1. The spectral radius of $A$, $\rho(A)$, is a positive eigenvalue of $A$.

2. $\rho(A)$ is an algebraically simple eigenvalue of $A$.

3. To $\rho(A)$ corresponds a positive eigenvector $0 < u \in \mathbb{R}^n$, i.e. $Au =
\rho(A)u$.

4. All other eigenvalues $\lambda$ of $A$ satisfy the inequality $|\lambda| < \rho(A)$ if and
only if $A$ is primitive, i.e. $A^k > 0$ for some integer $k \ge 1$.






5. Assume that $A$ is imprimitive, i.e. not primitive. If $n = 1$ then
$A = 0_{1 \times 1}$. Assume that $n > 1$. Then there exist exactly $h - 1 \ge 1$
distinct eigenvalues $\lambda_1, \ldots, \lambda_{h-1}$ different from $\rho(A)$ and satisfying
$|\lambda_i| = \rho(A)$. Furthermore, the following conditions hold.

(a) $\lambda_i$ is an algebraically simple eigenvalue of $A$ for $i = 1, \ldots, h-1$.

(b) The complex numbers $\frac{\lambda_i}{\rho(A)}$, $i = 1, \ldots, h-1$, and 1 are all $h$ roots
of unity, i.e. $\lambda_i = \rho(A) e^{\frac{2\pi\sqrt{-1}\, i}{h}}$ for $i = 1, \ldots, h-1$. Furthermore,
if $A z_i = \lambda_i z_i$, $z_i \ne 0$, then $|z_i| = u > 0$, the Perron-Frobenius
eigenvector $u$ given in 3.

(c) Let $\zeta$ be any $h$-root of 1, i.e. $\zeta^h = 1$. Then the matrix $\zeta A$ is
similar to $A$. Hence, if $\lambda$ is an eigenvalue of $A$ then $\zeta\lambda$ is an
eigenvalue of $A$ having the same algebraic and geometric multiplicity
as $\lambda$.

(d) There exists a permutation matrix $P \in \mathcal{P}_n$ such that $P^T A P = B$
has a block $h$-circulant form

$$B = \begin{bmatrix}
0 & B_{12} & 0 & \cdots & 0 & 0 \\
0 & 0 & B_{23} & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 0 & B_{(h-1)h} \\
B_{h1} & 0 & 0 & \cdots & 0 & 0
\end{bmatrix},$$

$B_{i(i+1)} \in \mathbb{R}_+^{n_i \times n_{i+1}}$, $i = 1, \ldots, h$, $B_{h(h+1)} = B_{h1}$, $n_{h+1} = n_1$, $n_1 + \ldots + n_h = n$.

Furthermore, the diagonal blocks of $B^h$ are all irreducible primitive
matrices, i.e.

(6.2.1) $C_i := B_{i(i+1)} \cdots B_{(h-1)h} B_{h1} \cdots B_{(i-1)i} \in \mathbb{R}_+^{n_i \times n_i}$, $i = 1, \ldots, h$,

are irreducible and primitive.

Our proof follows closely the proof of H. Wielandt [Wie50]. For a
nonnegative matrix $A = [a_{ij}] \in \mathbb{R}_+^{n \times n}$ define

(6.2.2) $r(x) := \min_{i,\, x_i > 0} \frac{(Ax)_i}{x_i}$, where $x = (x_1, \ldots, x_n)^T \gneq 0$.

It is straightforward to show, e.g. Problem 1, that

(6.2.3) $r(x) = \max\{s \ge 0,\ sx \le Ax\}$.

Theorem 6.2.2 (Wielandt's characterization) Let $A = [a_{ij}] \in \mathbb{R}_+^{n \times n}$
be irreducible. Then

(6.2.4) $\max_{x \gneq 0} r(x) = \max_{x = (x_1, \ldots, x_n)^T \gneq 0}\ \min_{i,\, x_i > 0} \frac{(Ax)_i}{x_i} = \rho(A) > 0$.

The maximum in the above characterization is achieved exactly for all $x \gneq 0$
of the form $x = au$, where $a > 0$ and $u = (u_1, \ldots, u_n)^T > 0$ is the
unique positive probability vector, i.e. $\sum_{i=1}^n u_i = 1$, satisfying $Au = \rho(A)u$.
Moreover, $\rho(A)$ is a geometrically simple eigenvalue.

Proof. Let $r(A) := \sup_{x \gneq 0} r(x)$. So $r(A) \ge r(\mathbf{1}) = \min_i \sum_{j=1}^n a_{ij}$.
Since an irreducible $A$ cannot have a zero row, e.g. Problem 2, it follows
that $r(A) \ge r(\mathbf{1}) > 0$.

Denote by

(6.2.5) $\Pi_n := \{(x_1, \ldots, x_n)^T \ge 0,\ \sum_{i=1}^n x_i = 1\}$

the convex set of probability vectors in $\mathbb{R}^n$. Note that $\Pi_n$ is a compact
set in $\mathbb{R}^n$, i.e. from any sequence $x_j \in \Pi_n$, $j = 1, \ldots$, we can find a subsequence
$x_{j_1}, x_{j_2}, \ldots$ which converges to $x \in \Pi_n$.

Clearly, for any $x \gneq 0$ and $a > 0$ we have $r(ax) = r(x)$. Hence

(6.2.6) $r(A) = \sup_{x \gneq 0} r(x) = \sup_{x \in \Pi_n} r(x)$.

Since $A$ is irreducible, $(I + A)^{n-1} > 0$. Hence for any $x \in \Pi_n$, $y = (I +
A)^{n-1} x > 0$. (See Problem 3a.) As $r(y)$ is a continuous function on
$(I + A)^{n-1}\Pi_n$, it follows that $r(y)$ achieves its maximum on $(I + A)^{n-1}\Pi_n$:

$r_1(A) := \max_{y \in (I+A)^{n-1}\Pi_n} r(y) = r(v)$, for some $v$ in $(I + A)^{n-1}\Pi_n$.

Since $r(A)$ is defined as the supremum of $r(x)$ on the set of all $x \gneq 0$, it follows
that $r(A) \ge r_1(A)$. We now show the reversed inequality $r(A) \le r_1(A)$,
which is equivalent to $r(x) \le r_1(A)$ for any $x \gneq 0$.
One has the basic inequality

(6.2.7) $r(x) \le r((I + A)^{n-1} x)$, $x \gneq 0$, with equality iff $Ax = r(x)x$,

see Problem 3d. For $x \in \Pi_n$ we have $r(x) \le r((I + A)^{n-1} x) \le r_1(A)$. In
view of (6.2.6) we have $r(A) \le r_1(A)$. Hence $r(A) = r_1(A)$.

Suppose that $r(x) = r(A)$, $x \gneq 0$. Then the definition of $r(A)$, (6.2.6)
and (6.2.7) yield that $r(x) = r((I + A)^{n-1} x)$. The equality case in (6.2.7)
yields that $Ax = r(A)x$. Hence $(1 + r(A))^{n-1} x = (I + A)^{n-1} x > 0$,
which yields that $x$ is a positive eigenvector corresponding to the eigenvalue
$r(A)$. So $x = au$, $a > 0$, for the corresponding probability eigenvector
$u = (u_1, \ldots, u_n)^T$, $Au = r(A)u$.

Suppose that $r(z) = r(A)$ for some vector $z = (z_1, \ldots, z_n)^T \gneq 0$. So
$z > 0$ and $Az = r(A)z$. Let $b = \min_i \frac{z_i}{u_i}$. We claim that $z = bu$. Otherwise
$w := z - bu \gneq 0$, $w$ has at least one coordinate equal to zero, and $Aw =
r(A)w$. So $r(w) = r(A)$. This is impossible since we showed above that then
$w > 0$! Hence $z = bu$. Assume now that $y \in \mathbb{R}^n$ is an eigenvector of $A$
corresponding to $r(A)$. So $Ay = r(A)y$. There exists a big positive number
$c$ such that $z = y + cu > 0$. Clearly $Az = r(A)z$. Hence $r(z) = r(A)$ and we
showed above that $z = bu$. So $y = (b - c)u$. Hence $r(A)$ is a geometrically
simple eigenvalue of $A$.

We now show that $r(A) = \rho(A)$. Let $\lambda \ne r(A)$ be another eigenvalue of
$A$, which may be complex valued. Then

$$(\lambda z)_i = \lambda z_i = (Az)_i = \sum_{j=1}^n a_{ij} z_j, \quad i = 1, \ldots, n,$$

where $0 \ne z = (z_1, \ldots, z_n)^T \in \mathbb{C}^n$ is the corresponding eigenvector of
$A$. Take the absolute values in the above equality, and use the triangle
inequality and the fact that $A$ is a nonnegative matrix to obtain

$$|\lambda|\,|z_i| \le \sum_{j=1}^n a_{ij} |z_j|, \quad i = 1, \ldots, n.$$

Let $|z| := (|z_1|, \ldots, |z_n|)^T \gneq 0$. Then the above inequality is equivalent to
$|\lambda|\,|z| \le A|z|$. Use (6.2.3) to deduce that $|\lambda| \le r(|z|)$. Since $r(|z|) \le r(A)$
we deduce that $|\lambda| \le r(A)$. Hence $\rho(A) = r(A)$, which yields (6.2.4). □
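
A numerical sketch of Wielandt's characterization: for a positive vector $x$, the quantity $r(x)$ of (6.2.2) is a lower bound for $\rho(A)$, and the quantity $R(x) = \max_i (Ax)_i / x_i$ of Problem 11 in this section is an upper bound. The iteration $x \mapsto (I + A)x$ used below to tighten the bracket is an assumption of this example (a plain power iteration on $I + A$), not the argument used in the proof; the matrix and the function names are also illustrative.

```python
def mat_vec(a, x):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(a[i][j] * x[j] for j in range(len(x))) for i in range(len(a))]


def r_lower(a, x):
    """r(x) of (6.2.2): minimum of (Ax)_i / x_i over the indices with x_i > 0."""
    ax = mat_vec(a, x)
    return min(ax[i] / x[i] for i in range(len(x)) if x[i] > 0)


def r_upper(a, x):
    """R(x) of Problem 11: maximum of (Ax)_i / x_i, for a positive vector x."""
    ax = mat_vec(a, x)
    return max(ax[i] / x[i] for i in range(len(x)))


def bracket_spectral_radius(a, steps=60):
    """Iterate x <- (I + A)x, rescaled to sum 1, and return (r(x), R(x))."""
    n = len(a)
    x = [1.0 / n] * n
    for _ in range(steps):
        ax = mat_vec(a, x)
        y = [x[i] + ax[i] for i in range(n)]
        total = sum(y)
        x = [v / total for v in y]
    return r_lower(a, x), r_upper(a, x)


# Irreducible example whose spectral radius is the golden ratio (1 + sqrt(5)) / 2.
A = [[0.0, 1.0],
     [1.0, 1.0]]
lower, upper = bracket_spectral_radius(A)
golden = (1 + 5 ** 0.5) / 2
```

For the starting vector $x = \mathbf{1}$ the bracket is just the smallest and largest row sums of $A$; the iteration narrows it towards $\rho(A)$.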

Lemma 6.2.3 Let $A \in \mathbb{R}_+^{n \times n}$ be an irreducible matrix. Then $\rho(A)$ is
an algebraically simple eigenvalue.

Proof. (For all the notions and results used here see §??.) Theorem
6.2.2 implies that $\rho(A)$ is geometrically simple, i.e. $\operatorname{nul}(\rho(A)I - A) = 1$.
Hence $\operatorname{rank}(\rho(A)I - A) = n - 1$. Hence $\operatorname{adj}(\rho(A)I - A) = t u v^T$, where
$Au = \rho(A)u$, $A^T v = \rho(A)v$, $u, v > 0$ and $0 \ne t \in \mathbb{R}$. Note that $uv^T$ is a
positive matrix, hence $\operatorname{tr} uv^T = v^T u > 0$. Since

$$(\det(\lambda I - A))'(\lambda = \rho(A)) = \operatorname{tr} \operatorname{adj}(\rho(A)I - A) = t(v^T u) \ne 0,$$

we deduce that $\rho(A)$ is a simple root of the characteristic polynomial of $A$.

□

As usual, denote by $S^1 := \{z \in \mathbb{C},\ |z| = 1\}$ the unit circle in the complex
plane.

Lemma 6.2.4 Let $A \in \mathbb{R}_+^{n \times n}$ be irreducible, $C \in \mathbb{C}^{n \times n}$. Assume that
$|C| \le A$. Then $\rho(C) \le \rho(A)$. Equality holds, i.e. there exists $\lambda \in \operatorname{spec} C$
such that $\lambda = \zeta\rho(A)$ for some $\zeta \in S^1$, if and only if there exists a complex
diagonal matrix $D \in \mathbb{C}^{n \times n}$, whose diagonal entries have absolute value 1, such that
$C = \zeta D A D^{-1}$. The matrix $D$ is unique up to a multiplication by $t \in S^1$.

Proof. Assume that $A = [a_{ij}]$, $C = [c_{ij}]$. Let $z = (z_1, \ldots, z_n)^T \ne 0$ be
an eigenvector of $C$ corresponding to an eigenvalue $\lambda$, i.e. $\lambda z = Cz$. The
arguments of the proof of Theorem 6.2.2 yield that $|\lambda|\,|z| \le |C|\,|z|$. Hence
$|\lambda|\,|z| \le A|z|$, which implies that $|\lambda| \le r(|z|) \le r(A) = \rho(A)$.

Suppose that $\rho(C) = \rho(A)$. So there exists $\lambda \in \operatorname{spec} C$ such that $|\lambda| =
\rho(A)$. So $\lambda = \zeta\rho(A)$ for some $\zeta \in S^1$. Furthermore, for the corresponding
eigenvector $z$ we have the equalities

$$|\lambda|\,|z| = |Cz| = |C|\,|z| = A|z| = r(A)|z|.$$

Theorem 6.2.2 yields that $|z|$ is a positive vector. Let $z_i = d_i |z_i|$, $|d_i| = 1$
for $i = 1, \ldots, n$. The equality $|Cz| = |C|\,|z| = A|z|$ combined with the
triangle inequality and $|C| \le A$ yields first that $|C| = A$. Furthermore, for
each fixed $i$ the nonzero complex numbers among $c_{i1} z_1, \ldots, c_{in} z_n$ have the same
argument, i.e. $c_{ij} = \zeta_i a_{ij} \bar{d}_j$ for $j = 1, \ldots, n$ and some complex number $\zeta_i$,
where $|\zeta_i| = 1$. Recall that $\lambda z_i = (Cz)_i$. Hence $\zeta_i = \zeta d_i$ for $i = 1, \ldots, n$.
Thus $C = \zeta D A D^{-1}$, where $D = \operatorname{diag}(d_1, \ldots, d_n)$. It is straightforward to
see that $D$ is unique up to replacing $D$ by $tD$ for any $t \in S^1$.

Suppose now that for $D = \operatorname{diag}(d_1, \ldots, d_n)$, where $|d_1| = \ldots = |d_n| = 1$
and $|\zeta| = 1$, we have that $C = \zeta D A D^{-1}$. Then $\lambda(C) = \zeta\lambda(A)$, see Fact
(??.??). So $\rho(C) = \rho(A)$. Furthermore $c_{ij} = \zeta d_i a_{ij} \bar{d}_j$, $i, j = 1, \ldots, n$. So
$|C| = A$. □

Lemma 6.2.5 Let $\zeta_1, \ldots, \zeta_h \in S^1$ be $h$ distinct complex numbers which
form a multiplicative semigroup, i.e. for any integers $i, j \in [1,h]$, $\zeta_i\zeta_j \in
\{\zeta_1, \ldots, \zeta_h\}$. Then the set $\{\zeta_1, \ldots, \zeta_h\}$ is the set (the group) of all $h$ roots
of 1: $e^{\frac{2\pi i\sqrt{-1}}{h}}$, $i = 1, \ldots, h$.

Proof. Let $\zeta \in T := \{\zeta_1, \ldots, \zeta_h\}$. Consider the sequence $\zeta^i$, $i = 1, \ldots$.
Since $\zeta^{i+1} = \zeta\zeta^i$ for $i = 1, \ldots$, and $T$ is a semigroup, it follows that each $\zeta^i$
is in $T$. Since $T$ is a finite set, we must have two positive integers such that
$\zeta^k = \zeta^l$ for $k < l$. Assume that $k$ and $l$ are the smallest possible positive
integers. So $\zeta^p = 1$, where $p = l - k \ge 1$, and $T_p := \{\zeta, \zeta^2, \ldots, \zeta^{p-1}, \zeta^p = 1\}$
are all $p$ roots of 1. $\zeta$ is called a $p$-primitive root of 1. I.e. $\zeta = e^{\frac{2\pi p_1\sqrt{-1}}{p}}$,
where $p_1$ is a positive integer less than $p$. Furthermore $p_1$ and $p$ are
coprime, which is denoted by $(p_1, p) = 1$. Note that $\zeta^i \in T$ for any integer $i$.

Next we choose $\zeta \in T$ such that $\zeta$ is a primitive $p$-root of 1 of the
maximal possible order. We claim that $p = h$, which is equivalent to the
equality $T = T_p$. Assume to the contrary that $T_p \subsetneq T$. Let $\eta \in T \setminus T_p$. The
previous arguments show that $\eta$ is a $q$-primitive root of 1. So $T_q \subset T$,
and $T_q \not\subset T_p$. So $q$ cannot divide $p$. Also the maximality of $p$ yields that
$q \le p$. Let $(p, q) = r$ be the g.c.d., the greatest common divisor of $p$ and $q$.
So $1 \le r < q$. Recall that the Euclid algorithm, which is applied to the division
of $p$ by $q$ with a residue, yields that there exist two integers $i, j$ such that
$ip + jq = r$. Let $l := \frac{pq}{r} > p$ be the least common multiple of $p$ and $q$.

Observe that $\zeta' = e^{\frac{2\pi\sqrt{-1}}{p}} \in T_p$, $\eta' = e^{\frac{2\pi\sqrt{-1}}{q}} \in T_q$. So

$$\xi := (\eta')^i (\zeta')^j = e^{\frac{2\pi\sqrt{-1}(ip + jq)}{pq}} = e^{\frac{2\pi\sqrt{-1}}{l}} \in T.$$

As $\xi$ is an $l$-primitive root of 1, we obtain a contradiction to the maximality
of $p$. So $p = h$ and $T$ is the set of all $h$-roots of unity. □

Lemma 6.2.6 Let $A \in \mathbb{R}_+^{n \times n}$ be irreducible, and assume that for a
positive integer $h \ge 2$, $A$ has $h - 1$ distinct eigenvalues $\lambda_1, \ldots, \lambda_{h-1}$, which
are distinct from $\rho(A)$, such that $|\lambda_1| = \ldots = |\lambda_{h-1}| = \rho(A)$. Then the
conditions (5a-5c) of Theorem 6.2.1 hold. Moreover, $P^T A P = B$, where $B$
is of the form given in (5d) and $P$ is a permutation matrix.

Proof. Assume that $\zeta_i := \frac{\lambda_i}{\rho(A)} \in S^1$ for $i = 1, \ldots, h-1$ and $\zeta_h = 1$.
Apply Lemma 6.2.4 to $C = A$ and $\lambda = \zeta_i\rho(A)$ to deduce that $A =
\zeta_i D_i A D_i^{-1}$, where $D_i$ is a diagonal matrix such that $|D_i| = I$ for $i = 1, \ldots, h$.
Hence, if $\lambda$ is an eigenvalue of $A$ then $\zeta_i\lambda$ is an eigenvalue of $A$, with the same
algebraic and geometric multiplicity as $\lambda$. In particular, since $\rho(A)$ is an
algebraically simple eigenvalue of $A$, $\lambda_i = \zeta_i\rho(A)$ is an algebraically simple
eigenvalue of $A$ for $i = 1, \ldots, h-1$. This establishes (5a).

Let $T = \{\zeta_1, \ldots, \zeta_h\}$. Note that

(6.2.8) $A = \zeta_i D_i A D_i^{-1} = \zeta_i D_i (\zeta_j D_j A D_j^{-1}) D_i^{-1} = (\zeta_i\zeta_j)(D_i D_j) A (D_i D_j)^{-1}$.

So $\zeta_i\zeta_j\rho(A)$ is an eigenvalue of $A$. Hence $\zeta_i\zeta_j \in T$, i.e. $T$ is a semigroup.
Lemma 6.2.5 yields that $\{\zeta_1, \ldots, \zeta_h\}$ are all $h$ roots of 1. Note that if
$A z_i = \lambda_i z_i$, $z_i \ne 0$, then $z_i = t D_i u$ for some $0 \ne t \in \mathbb{C}$, where $u > 0$ is
the Perron-Frobenius vector given in Theorem 6.2.1. This establishes (5b) of
Theorem 6.2.1.

Let $\zeta = e^{\frac{2\pi\sqrt{-1}}{h}} \in T$. Then $A = \zeta D A D^{-1}$, where $D$ is a diagonal
matrix $D = \operatorname{diag}(d_1, \ldots, d_n)$, $|D| = I$. Since $D$ can be replaced by $\bar{d}_1 D$, we can
assume that $d_1 = 1$. (6.2.8) yields that $A = \zeta^h D^h A D^{-h} = I A I^{-1}$. Lemma
6.2.4 yields that $D^h = \operatorname{diag}(d_1^h, \ldots, d_n^h) = tI$. Since $d_1 = 1$ it follows that
$D^h = I$. So all the diagonal entries of $D$ are $h$-roots of unity. Let $P \in \mathcal{P}_n$
be a permutation matrix such that the diagonal matrix $E = P^T D P$ is of
the following block diagonal form

$$E = I_{n_1} \oplus \mu_1 I_{n_2} \oplus \cdots \oplus \mu_{l-1} I_{n_l}, \quad \mu_i = e^{\frac{2\pi\sqrt{-1}\,k_i}{h}},\ i = 1, \ldots, l-1, \quad 1 \le k_1 < k_2 < \ldots < k_{l-1} \le h-1.$$

Note that $l \le h$ and equality holds if and only if $k_i = i$. Let $\mu_0 = 1$.

Let $B = P^T A P$. Partition $B$ to a block matrix $[B_{ij}]_{i,j=1}^l$, where $B_{ij} \in
\mathbb{R}_+^{n_i \times n_j}$ for $i, j = 1, \ldots, l$. Then the equality $A = \zeta D A D^{-1}$ yields $B =
\zeta E B E^{-1}$. The structure of $B$ and $E$ implies the equalities

$$B_{ij} = \zeta\frac{\mu_{i-1}}{\mu_{j-1}} B_{ij}, \quad i, j = 1, \ldots, l.$$

Since all the entries of $B_{ij}$ are nonnegative we obtain that $B_{ij} = 0$ if
$\zeta\frac{\mu_{i-1}}{\mu_{j-1}} \ne 1$. Hence $B_{ii} = 0$ for $i = 1, \ldots, l$. Since $B$ is irreducible it follows
that not all $B_{i1}, \ldots, B_{il}$ are zero matrices for each $i = 1, \ldots, l$. First start
with $i = 1$. Since $\mu_0 = 1$ and $k_1 \ge 1$ it follows that $\mu_j \ne \zeta$ for $j > 1$. So
$B_{1j} = 0$ for $j = 3, \ldots, l$. Hence $B_{12} \ne 0$, which implies that $\mu_1 = \zeta$, i.e.
$k_1 = 1$. Now let $i = 2$ and consider $j = 1, \ldots, l$. As $k_j \in [k_1 + 1, h - 1]$
for $j > 1$, it follows that $B_{2j} = 0$ for $j \ne 3$. Hence $B_{23} \ne 0$, which yields
that $k_2 = 2$. Applying these arguments for $i = 3, \ldots, l-1$ we deduce that
$B_{ij} = 0$ for $j \ne i + 1$, $B_{i(i+1)} \ne 0$, $k_i = i$ for $i = 1, \ldots, l-1$. It is left to
consider $i = l$. Note that

$$\zeta\frac{\mu_{l-1}}{\mu_{j-1}} = \frac{\zeta^l}{\zeta^{j-1}} = \zeta^{l-(j-1)}, \text{ which is different from 1 for } j \in [2, l].$$

Hence $B_{lj} = 0$ for $j > 1$. Since $B$ is irreducible, $B_{l1} \ne 0$. So $\zeta^l = 1$. Since
$l \le h$, we deduce that $l = h$. Hence $B$ has the block form given in (5d). □

Proposition 6.2.7 Let $A \in \mathbb{R}_+^{n \times n}$ be irreducible. Suppose that $0 \lneq
w \in \mathbb{R}^n$ is an eigenvector of $A$, i.e. $Aw = \lambda w$. Then $\lambda = \rho(A)$ and $w > 0$.

Proof. Let $v > 0$ be the Perron-Frobenius vector of $A^T$, i.e. $A^T v =
\rho(A)v$. Then

$$\lambda v^T w = v^T (Aw) = (A^T v)^T w = \rho(A) v^T w \ \Rightarrow\ (\rho(A) - \lambda) v^T w = 0.$$

If $\rho(A) \ne \lambda$ we deduce that $v^T w = 0$, which is impossible, since $v > 0$ and
$w \gneq 0$. Hence $\lambda = \rho(A)$. Then $w$ is the Perron-Frobenius eigenvector and
$w > 0$. □



Lemma 6.2.8 Let $A \in \mathbb{R}_+^{n \times n}$ be irreducible. Then $A$ is primitive if and
only if each eigenvalue $\lambda$ of $A$ different from $\rho(A)$ satisfies the inequality
$|\lambda| < \rho(A)$. I.e. condition (4) of Theorem 6.2.1 holds.

Proof. By considering $B = \frac{1}{\rho(A)} A$ it is enough to consider the case
$\rho(A) = 1$. Assume first that if $\lambda \ne 1$ is an eigenvalue of $A$ then $|\lambda| < 1$.
Theorem 6.2.2 implies $Au = u$, $A^T w = w$ for some $u, w > 0$. So $w^T u > 0$.
Let $v := (w^T u)^{-1} w$. Then $A^T v = v$ and $v^T u = 1$. Fact ?? yields that
$\lim_{k \to \infty} A^k = uv^T > 0$. So there exists an integer $k_0 \ge 1$ such that $A^k > 0$
for $k \ge k_0$, i.e. $A$ is primitive.

Assume now that $A$ has exactly $h > 1$ distinct eigenvalues $\lambda$ satisfying
$|\lambda| = 1$. Lemma 6.2.6 implies that there exists a permutation matrix $P$
such that $B = P^T A P$ is of the form (5d) of Theorem 6.2.1. Note that $B^h$
is a block diagonal matrix. Hence $B^{hj} = (B^h)^j$ is a block diagonal matrix
for $j = 1, 2, \ldots$. Hence $B^{hj}$ is never a positive matrix, so $A^{hj}$ is never a
positive matrix. In view of Problem 4, $A$ is not primitive. □



Lemma 6.2.9 Let $B \in \mathbb{R}_+^{n \times n}$ be an irreducible, imprimitive matrix,
having $h > 1$ distinct eigenvalues $\lambda$ satisfying $|\lambda| = \rho(B)$. Suppose furthermore
that $B$ has the form (5d) of Theorem 6.2.1. Then $B^h$ is a block
diagonal matrix, where each diagonal block is an irreducible primitive matrix
whose spectral radius is $\rho(B)^h$. In particular, the last claim of (5d) of
Theorem 6.2.1 holds.

Proof. Let $D(B) = (\langle n \rangle, E)$ be the digraph associated with $B$. Let
$p_0 = 0$, $p_1 = p_0 + n_1, \ldots, p_h = p_{h-1} + n_h = n$. Denote $V_i = \{p_{i-1} + 1, \ldots, p_i\}$
for $i = 1, \ldots, h$, and let $V_{h+1} := V_1$. So $\langle n \rangle = \cup_{i=1}^h V_i$. The form of $B$
implies that $E \subset \cup_{i=1}^h V_i \times V_{i+1}$. Thus, the length of any walk that connects vertices
$j, k \in V_i$ must be divisible by $h$. Observe next that $B^h = \operatorname{diag}(C_1, \ldots, C_h)$,
where $C_1 = [c_{jk}^{(1)}]_{j,k=1}^{n_1}, \ldots, C_h = [c_{jk}^{(h)}]_{j,k=1}^{n_h}$ are defined in (6.2.1). Let
$D(C_i) = (V_i, E_i)$ be the digraph associated with $C_i$ for $i = 1, \ldots, h$. Then
there exists a walk of length $h$ in $D(B)$ from $j$ to $k$ in $V_i$ if and only if $c_{jk}^{(i)} > 0$.
Since $B$ is irreducible, $D(B)$ is strongly connected. Hence, each $D(C_i)$ is
strongly connected. Thus, each $C_i$ is irreducible.

Recall that $Bu = \rho(B)u$ for the Perron-Frobenius vector $u^T = (u_1^T, \ldots, u_h^T)$,
$0 < u_i \in \mathbb{R}_+^{n_i}$, $i = 1, \ldots, h$. Thus, $B^h u = \rho(B)^h u$, which implies that
$C_i u_i = \rho(B)^h u_i$, $i = 1, \ldots, h$. Since $u_i > 0$, Proposition 6.2.7 yields that
$\rho(C_i) = \rho(B)^h$, $i = 1, \ldots, h$. Recall that the eigenvalues of $B^h$ are the $h$-th
powers of the eigenvalues of $B$, i.e. $\lambda(B^h) = (\lambda_1^h, \ldots, \lambda_n^h)$, where $\lambda(B) =
(\lambda_1, \ldots, \lambda_n)$. Furthermore, $B$ has $h$ simple eigenvalues $\rho(B)e^{\frac{2\pi i\sqrt{-1}}{h}}$, $i =
1, \ldots, h$, with $|\lambda| = \rho(B)$, and all other eigenvalues satisfy $|\lambda| < \rho(B)$.
Hence $B^h$ has one eigenvalue $\rho(B)^h$ of algebraic multiplicity $h$ and all
other eigenvalues $\mu$ satisfy $|\mu| < \rho(B)^h$.

Since $B^h = \operatorname{diag}(C_1, \ldots, C_h)$, we deduce that $\lambda(B^h) = (\lambda(C_1), \ldots, \lambda(C_h))$.
As $C_i$ is irreducible and $\rho(C_i) = \rho(B)^h$, we deduce that all other eigenvalues
$\mu$ of $C_i$ satisfy $|\mu| < \rho(C_i)$. Lemma 6.2.8 yields that $C_i$ is primitive. □
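
A small worked instance of Lemma 6.2.9 with $h = 2$, sketched under the assumption of a hypothetical $3 \times 3$ matrix with block sizes $n_1 = 1$, $n_2 = 2$. Its characteristic polynomial is $\lambda^3 - 7\lambda$, so $\rho(B)^2 = 7$; the code checks that $B^2$ is block diagonal with diagonal blocks $C_1 = B_{12}B_{21}$ and $C_2 = B_{21}B_{12}$ as in (6.2.1).

```python
def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


# Hypothetical blocks of a 2-circulant matrix B = [[0, B12], [B21, 0]].
B12 = [[1, 2]]          # n_1 x n_2 = 1 x 2
B21 = [[1], [3]]        # n_2 x n_1 = 2 x 1
B = [[0, 1, 2],
     [1, 0, 0],
     [3, 0, 0]]

B_squared = mat_mul(B, B)
C1 = mat_mul(B12, B21)  # diagonal block of B^2 on V_1 = {1}
C2 = mat_mul(B21, B12)  # diagonal block of B^2 on V_2 = {2, 3}
```

Here $C_1 = [7]$, and $C_2$ has trace 7 and determinant 0, so both blocks have spectral radius $7 = \rho(B)^2$ and both are positive matrices, hence primitive.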

Problems 

1. Prove equality (6.2.3). 

2. Show that if $A \in \mathbb{R}_+^{n \times n}$ is irreducible then $A$ cannot have a zero row or
column.

3. Assume that $A \in \mathbb{R}_+^{n \times n}$ is irreducible. Show

(a) For each $x \in \Pi_n$ the vector $(I + A)^{n-1} x$ is positive.

(b) The set $(I + A)^{n-1}\Pi_n := \{y = (I + A)^{n-1} x,\ x \in \Pi_n\}$ is a
compact set of positive vectors.

(c) Show that $r(y)$ is a continuous function on $(I + A)^{n-1}\Pi_n$.

(d) Show (6.2.7). Hint: use that $(A + I)^{n-1}(Ax - r(x)x)$ is a positive
vector, unless $Ax = r(x)x$.

4. Assume that $A \in \mathbb{R}_+^{n \times n}$ is irreducible. Show the following are equivalent

(a) $A$ is primitive.

(b) There exists a positive integer $k_0$ such that for any integer $k \ge k_0$,
$A^k > 0$.

5. Let $D = (\langle h \rangle, E)$ be the cycle $1 \to 2 \to \ldots \to h-1 \to h \to 1$.

(a) Show that the representation matrix $A(D)$ is a permutation matrix,
which has the form of $B$ given in (5d) of Theorem 6.2.1, where
each nonzero block is the $1 \times 1$ matrix [1]. $A(D)$ is called a circulant
matrix.

(b) Find all the eigenvalues and the corresponding eigenvectors of
$A(D)$.

6. Assume the assumptions of Lemma 6.2.9 and the notation of the
proof of Lemma 6.2.9.

(a) Show that the length of any closed walk in $D(B)$ is divisible by
$h$.

(b) Show that the length of any walk from a vertex in $V_i$ to a vertex in
$V_j$, such that $1 \le i < j \le h$, minus $j - i$ is divisible by $h$.

(c) What can you say on the length of any walk from a vertex in $V_i$
to a vertex in $V_j$, such that $1 \le j < i \le h$?

(d) Show that each $C_i$ is irreducible.

7. Let $D = (V, E)$ be a digraph and assume that $V$ is a disjoint union
of $h$ nonempty sets $V_1, \ldots, V_h$. Denote $V_{h+1} := V_1$. Assume that
$E \subset \cup_{i=1}^h V_i \times V_{i+1}$. Let $D_i = (V_i, E_i)$ be the following digraph. The
diedge $(v, w) \in E_i$ if there is a path of length $h$ in $D$ from $v$ to $w$.

(a) Show that $D$ is strongly connected if and only if $D_i$ is strongly
connected for $i = 1, \ldots, h$.

(b) Assume that $D$ is strongly connected. Let $1 \le i < j \le h$. Then
$D_i$ is primitive if and only if $D_j$ is primitive.

8. Let $B \in \mathbb{R}_+^{n \times n}$ be a block matrix of the form given in (5d) of Theorem
6.2.1.

(a) Show that $B^h$ is a block diagonal matrix $\operatorname{diag}(C_1, \ldots, C_h)$, where
$C_i$ is given in (6.2.1).

(b) Show that $B$ is irreducible if and only if $C_i$ is irreducible for
$i = 1, \ldots, h$.

(c) Assume that $B$ is irreducible.

i. Let $1 \le i < j \le h$. Then $C_i$ is primitive if and only if $C_j$ is
primitive.

ii. $B$ has $h$ distinct eigenvalues on the circle $|z| = \rho(B)$ if and
only if some $C_i$ is primitive.

9. Assume the assumptions of Lemma 6.2.6. Let $Au = \rho(A)u$, $u =
(u_1, \ldots, u_n)^T > 0$. Assume that $\eta$ is an $h$-root of unity, and suppose
that $Az = \eta z$, $z = (z_1, \ldots, z_n)$, such that $|z| = u$. Assume that
$z_i = u_i$ for a given $i \in \langle n \rangle$. (This is always possible by considering
$\frac{\bar{z}_i}{|z_i|} z$.) Show $z_j = \eta^{k(j)} u_j$, for a suitable integer $k(j)$, for $j = 1, \ldots, n$.
Furthermore, given an integer $k$ then there exists $j \in \langle n \rangle$ such that
$z_j = \eta^k u_j$.

Hint: Use the proof of Lemma 6.2.6.

10. Let $B \in \mathbb{R}_+^{n \times n}$ be an irreducible block matrix of the form given in
(5d) of Theorem 6.2.1. Let $C_1, \ldots, C_h$ be defined in (6.2.1). Suppose
that $B$ has more than $h$ distinct eigenvalues on the circle $|z| = \rho(B)$.
Then TFAE

(a) $B$ has $qh$ eigenvalues on the circle $|z| = \rho(B)$, for some $q > 1$.

(b) Each $C_i$ has $q > 1$ distinct eigenvalues on $|z| = \rho(C_i) = \rho(B)^h$.

(c) Some $C_i$ has $q > 1$ distinct eigenvalues on $|z| = \rho(C_i) = \rho(B)^h$.

(d) Let $D(B) = (\langle n \rangle, E)$ and $V_i = \{p_{i-1} + 1, \ldots, p_i\}$ for $i = 1, \ldots, h$
be defined as in the proof of Lemma 6.2.9. Then each $V_i$ is a
disjoint union of $q$ nonempty sets $W_i, W_{i+h}, \ldots, W_{i+(q-1)h}$ for
$i = 1, \ldots, h$, such that $E \subset \cup_{j=1}^{qh} W_j \times W_{j+1}$, where $W_{qh+1} :=
W_1$. Let $H_j = (W_j, F_j)$, $F_j \subset W_j \times W_j$, be the following digraph.
The diedge $(v, w)$ is in $F_j$ if and only if there is a path of length
$qh$ in $D(B)$ from $v$ to $w$ in $W_j$. Then each digraph $H_j$ is strongly
connected and primitive.

Hint: Use the structure of the eigenvalues $\lambda$ of $B$ on the circle
$|\lambda| = \rho(B)$, and the corresponding eigenvector $z$ to $\lambda$ given in (5b) of
Theorem 6.2.1.

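11. For $A \in \mathbb{R}_+^{n \times n}$ and $0 < x = (x_1, \ldots, x_n)^T \in \mathbb{R}_+^n$ let $R(x) = \max_{i \in \langle n \rangle} \frac{(Ax)_i}{x_i}$.
Assume that $A$ is irreducible. Show that $\inf_{x > 0} R(x) = \rho(A)$. I.e.

(6.2.9) $\min_{x = (x_1, \ldots, x_n)^T > 0}\ \max_{i \in \langle n \rangle} \frac{(Ax)_i}{x_i} = \rho(A)$.

Furthermore, $R(x) = \rho(A)$ if and only if $Ax = \rho(A)x$.

Hint: Mimic the proof of Theorem 6.2.2.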

12. Let $n > 1$ and $D = (\langle n \rangle, E)$ be a strongly connected digraph. Show

(a) If $D$ has exactly one cycle, it must be a Hamiltonian cycle, i.e.
the length of this cycle is $n$. Then $D$ is not primitive.

(b) Suppose that $D$ has exactly two directed cycles. Then the shortest
cycle has length $n - 1$ if and only if it is possible to rename
the vertices so that the shortest cycle is of the form $1 \to 2 \to
\ldots \to n-1 \to 1$ and the second cycle is a Hamiltonian cycle
$1 \to 2 \to \ldots \to n-1 \to n \to 1$. In this case $D$ is primitive.
Moreover $A(D)^k > 0$ if and only if $k \ge n^2 - 2n + 2$.

(c) Assume that $D$ is primitive. Show that the shortest cycle of $D$
has at most length $n - 1$.

6.3 Index of primitivity

Theorem 6.3.1 Let $A \in \mathbb{R}_+^{n \times n}$ be a primitive matrix. Let $s \ge 1$ be
the length of the shortest cycle in the digraph $D(A) = (\langle n \rangle, E)$. Then
$A^{s(n-2)+n} > 0$. In particular $A^{(n-1)^2+1} > 0$.

Proof. For $n = 1$ we have that $s = 1$ and the theorem is trivial. Assume
that $n > 1$. Note that since $A$ is primitive $s \le n - 1$. (See Problem 12c.)

Suppose first that $s = 1$. So $D(A)$ contains a loop. Relabel the vertices
of $D(A)$ to assume that $(1,1) \in E$. I.e. we can assume that $A = [a_{ij}]$
and $a_{11} > 0$. Recall that from 1 to $j > 1$ there exists a path of length
$1 \le l(j) \le n - 1$. By looping at 1 first $n - 1 - l(j)$ times we deduce the
existence of a walk of length $n - 1$ from 1 to $j > 1$. Clearly, there exists a
walk of length $n - 1$ from 1 to 1: $1 \to 1 \to \ldots \to 1$. Similarly, for each $j > 1$
there exists a walk of length $n - 1$ from $j$ to 1. Hence, the first row and
the first column of $A^{n-1}$ are positive. Thus, $A^{2(n-1)} = A^{n-1}A^{n-1}$ is a positive
matrix.

Assume now that $s \ge 2$. Relabel the vertices of $D(A)$ such that one has
the cycle on vertices $c := \{1, 2, \ldots, s\}$: $1 \to 2 \to \ldots \to s \to 1$. Then the first
$s$ diagonal entries of $A^s$ are positive. Since $A$ is primitive, Lemma 6.2.8
implies that $A^s$ is primitive. Our previous arguments show that $(A^s)^{n-1}$
has the first $s$ rows and columns positive. Let

$$A^{n-s} = \begin{bmatrix} F_{11} & F_{12} \\ F_{21} & F_{22} \end{bmatrix}, \quad F_{11} \in \mathbb{R}_+^{s \times s},\ F_{12} \in \mathbb{R}_+^{s \times (n-s)},\ F_{21} \in \mathbb{R}_+^{(n-s) \times s},\ F_{22} \in \mathbb{R}_+^{(n-s) \times (n-s)}.$$

Clearly, $F_{11} \ge ([a_{ij}]_{i,j=1}^s)^{n-s}$. Since $D(A)$ contains a cycle of length $s$ on
$\langle s \rangle$ it follows that each row and column of $F_{11}$ is not zero. Clearly,

(6.3.1) $A^{s(n-2)+n} = A^{n-s} A^{s(n-1)}$.

Hence the first $s$ rows of $A^{s(n-2)+n}$ are positive. We claim that each row of
$F_{21}$ is nonzero. Indeed, take the shortest walk from $j \in U := \{s+1, \ldots, n\}$
to the set of vertices $V := \{1, \ldots, s\}$. This shortest walk is a path which
can contain at most $n - s$ vertices in $U$, before it ends in $i \in V$. Hence the
length of this path is $m(j) \le n - s$. After that continue with a walk on the
cycle $c$ of length $n - s - m(j)$, to deduce that there is a walk of length $n - s$
from $j$ to $V$. Hence the $(j - s)$-th row of $F_{21}$ is nonzero. Use (6.3.1) and the
fact that the first $s$ rows of $(A^s)^{n-1}$ are positive to deduce that $A^{s(n-2)+n} > 0$.

□
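
The bound of Theorem 6.3.1 can be explored numerically. The sketch below computes the index of primitivity by Boolean matrix powers, searching only up to the exponent $(n-1)^2 + 1$ guaranteed by the theorem, and evaluates it on the two-cycle digraph of Problem 12(b) of Section 6.2, for which the bound $s(n-2) + n$ with $s = n - 1$ equals $n^2 - 2n + 2$. The function names and the 0-based labelling are assumptions of this example.

```python
def bool_mat_mul(a, b):
    """Boolean product: entry (i, j) is True iff some a[i][t] and b[t][j] hold."""
    n = len(a)
    return [[any(a[i][t] and b[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]


def index_of_primitivity(a):
    """Smallest k with A^k > 0, or None if no k <= (n-1)^2 + 1 works."""
    n = len(a)
    pattern = [[a[i][j] > 0 for j in range(n)] for i in range(n)]
    power = pattern
    for k in range(1, (n - 1) ** 2 + 2):
        if all(all(row) for row in power):
            return k
        power = bool_mat_mul(power, pattern)
    return None


def shortest_cycle_length(a):
    """Smallest k such that A^k has a positive diagonal entry (None if acyclic)."""
    n = len(a)
    pattern = [[a[i][j] > 0 for j in range(n)] for i in range(n)]
    power = pattern
    for k in range(1, n + 1):
        if any(power[i][i] for i in range(n)):
            return k
        power = bool_mat_mul(power, pattern)
    return None


def two_cycle_digraph(n):
    """Problem 12(b), 0-based: cycles 0->1->...->n-2->0 and 0->1->...->n-1->0."""
    a = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        a[i][i + 1] = 1
    a[n - 1][0] = 1   # closes the Hamiltonian cycle
    a[n - 2][0] = 1   # closes the cycle of length n - 1
    return a


W4 = two_cycle_digraph(4)
s = shortest_cycle_length(W4)
bound = s * (4 - 2) + 4            # s(n - 2) + n from Theorem 6.3.1
exact = index_of_primitivity(W4)
```

By Theorem 6.3.1, a return value of None from index_of_primitivity means that the matrix is not primitive.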



Proof of Theorem 6.1.5. Problem 6.1.5.1 yields that the length $L$
of any closed walk in $D$ is a sum of lengths of a number of cycles in $D$.
Hence $\ell$ divides $L$. Assume that $D$ is primitive, i.e. $A(D)^k > 0$ for any $k \ge k_0$.
Hence for each $k \ge k_0$ there exists a closed walk in $D$ of length $k$. Therefore
$\ell = 1$.

Suppose now that $D = (V, E)$ is imprimitive, i.e. $A(D)$ is imprimitive.
(5d) of Theorem 6.2.1 yields that $V$ decomposes into nonempty disjoint
sets $V_1, \ldots, V_h$, where $h > 1$. Moreover $E \subset \cup_{i=1}^h V_i \times V_{i+1}$, where
$V_{h+1} = V_1$. So the length of any closed walk must be divisible by $h > 1$. In particular,
the length of each cycle is divisible by $h$. Thus $\ell \ge h > 1$. Hence $D$ is
primitive if and only if $\ell = 1$. Suppose that $D$ is primitive. Theorem 6.3.1
yields that $A(D)^{s(n-2)+n} > 0$, where $s$ is the length of the shortest cycle.
This proves part 1 of Theorem 6.1.5.

Assume now that $D$ is imprimitive. So $A(D)$ has $h > 1$ distinct eigenvalues
of modulus $\rho(A(D))$. Relabel the vertices of $D$ so that $A(D)$ is of
the form $B$ given in (5d) of Theorem 6.2.1. As we pointed out, the length of each cycle
in $D$ is divisible by $h$. It is left to show that $\ell = h$. Let $D_i = (V_i, E_i)$
be defined as in the proof of Lemma 6.2.9. It is straightforward to see that
each cycle in $D_i$ of length $L$ corresponds to a cycle in $D$ of length $hL$.
Since $C_j$ is primitive, it follows from the first part of the proof that the
g.c.d. of lengths of all cycles in $D_j$ is 1. Hence, the g.c.d. of lengths of the
corresponding cycles in $D$ is $h$. □
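
The index of imprimitivity, i.e. the g.c.d. of all cycle lengths, can also be computed without eigenvalues. The sketch below uses a standard breadth-first-search technique that is not taken from the text: if $d(v)$ is the BFS distance from a fixed vertex, then for a strongly connected digraph the g.c.d. of the numbers $d(u) + 1 - d(v)$ over all diedges $(u,v)$ equals the g.c.d. of the cycle lengths, and the residues of $d(v)$ modulo this number give the classes $V_1, \ldots, V_h$. The input is assumed to be strongly connected with at least one cycle; names and examples are illustrative.

```python
from collections import deque
from math import gcd


def index_of_imprimitivity(a):
    """Return (h, classes) for a strongly connected digraph D(A) with a cycle.

    h is the g.c.d. of the cycle lengths; classes[r] lists the vertices whose
    BFS distance from vertex 0 is congruent to r modulo h.
    """
    n = len(a)
    level = [None] * n
    level[0] = 0
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v in range(n):
            if a[u][v] > 0 and level[v] is None:
                level[v] = level[u] + 1
                queue.append(v)
    if any(d is None for d in level):
        raise ValueError("some vertex is not reachable from vertex 0")
    h = 0
    for u in range(n):
        for v in range(n):
            if a[u][v] > 0:
                # BFS levels satisfy level[v] <= level[u] + 1, so this is >= 0.
                h = gcd(h, level[u] + 1 - level[v])
    if h == 0:
        raise ValueError("the digraph has no cycle")
    classes = [[v for v in range(n) if level[v] % h == r] for r in range(h)]
    return h, classes


four_cycle = [[0, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1],
              [1, 0, 0, 0]]
path_like = [[0, 1, 0],
             [1, 0, 1],
             [0, 1, 0]]          # only cycles of length 2
primitive = [[0, 1, 0],
             [1, 0, 1],
             [1, 0, 0]]          # cycles of lengths 2 and 3
```

Every diedge leads from classes[r] to classes[(r + 1) % h], which is the decomposition described in part 2 of Theorem 6.1.5.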



6.4 Reducible matrices 

Theorem 6.4.1 Let A e M" x ". Then p{A), the spectral radius of A, 
is an eigenvalue of A. There exists a probability vector x <G LI n such that 
Ax = p(A)x. 

Proof. Let $J_n \in \{1\}^{n \times n}$ be the matrix whose entries are all 1. For $\varepsilon > 0$ let $A(\varepsilon) = A + \varepsilon J_n$. Then $A(\varepsilon) > 0$. Hence,

(6.4.1) $\rho(A(\varepsilon)) \in \operatorname{spec}(A(\varepsilon))$ and $A(\varepsilon)x(\varepsilon) = \rho(A(\varepsilon))x(\varepsilon)$, $0 < x(\varepsilon) \in \Pi_n$ for $\varepsilon > 0$.

Since the coefficients of the characteristic polynomial of $A(\varepsilon)$ are polynomials in $\varepsilon$, it follows that the eigenvalues of $A(\varepsilon)$ are continuous functions of $\varepsilon$. Hence

$$\lim_{\varepsilon \to 0} \operatorname{spec}(A(\varepsilon)) = \operatorname{spec} A, \qquad \lim_{\varepsilon \to 0} \rho(A(\varepsilon)) = \rho(A).$$

Combine that with (6.4.1) to deduce that $\rho(A) \in \operatorname{spec} A$. Choose $\varepsilon_k = \frac{1}{k}$, $k = 1, 2, \ldots$. Since $\Pi_n$ is a compact set, there exists a subsequence $1 \le k_1 < k_2 < \cdots$ such that $\lim_{j \to \infty} x(\varepsilon_{k_j}) = x \in \Pi_n$. The second equality of (6.4.1) yields that $Ax = \rho(A)x$. □

It is easy to have examples where $\rho(A) = 0$ for some $A \in \mathbb{R}_+^{n \times n}$, and $\rho(A)$ is not a geometrically simple eigenvalue. (I.e. the Jordan canonical form of $A$ contains a Jordan block of order greater than one with the eigenvalue $\rho(A)$.)
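
The perturbation argument in the proof of Theorem 6.4.1 can be tried numerically. The following is a minimal sketch, not part of the original text: the helper names `mat_vec` and `perron_pair` are hypothetical, and plain power iteration is assumed as the way to approximate the Perron pair of the positive matrix $A(\varepsilon) = A + \varepsilon J_n$. The test matrix is a nilpotent Jordan block, so $\rho(A) = 0$.

```python
import math

def mat_vec(A, x):
    """Multiply a square matrix (given as a list of rows) by a vector."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def perron_pair(A, eps, iters=5000):
    """Power iteration for A(eps) = A + eps*J_n, normalised so that sum(x) = 1.

    Returns (rho, x): approximations of rho(A(eps)) and of a probability
    eigenvector x(eps). This is an illustrative helper, not a library routine.
    """
    n = len(A)
    A_eps = [[A[i][j] + eps for j in range(n)] for i in range(n)]
    x = [1.0 / n] * n
    rho = 0.0
    for _ in range(iters):
        y = mat_vec(A_eps, x)
        rho = sum(y)          # equals 1^T A(eps) x, because sum(x) = 1
        if rho == 0.0:        # can only happen when eps = 0 and A x = 0
            break
        x = [yi / rho for yi in y]
    return rho, x

# A single Jordan block of order two with eigenvalue 0, so rho(A) = 0.
A = [[0.0, 1.0], [0.0, 0.0]]
for eps in (1e-1, 1e-2, 1e-3):
    rho, x = perron_pair(A, eps)
    print(eps, rho, x)
```

For this particular $A$ the eigenvalues of $A(\varepsilon)$ are $\varepsilon \pm \sqrt{\varepsilon(1+\varepsilon)}$, so the printed values of $\rho(A(\varepsilon))$ decrease to $\rho(A) = 0$ as $\varepsilon \to 0$, while each $x(\varepsilon)$ stays a positive probability vector.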

Proposition 6.4.2 Let $A \in \mathbb{R}_+^{n \times n}$.

1. Assume that $C \in \mathbb{R}_+^{n \times n}$ and $A \ge C$. Then $\rho(A) \ge \rho(C)$. If either $A$ or $C$ is irreducible then $\rho(A) = \rho(C)$ if and only if $A = C$.

2. Assume that $B \in \mathbb{R}_+^{m \times m}$, $1 \le m < n$, is a principal submatrix of $A$, obtained by deleting the $n - m$ rows and columns of $A$ indexed by a subset $J \subset \langle n \rangle$ of cardinality $n - m$. Then $\rho(B) \le \rho(A)$. If $A$ is irreducible then $\rho(B) < \rho(A)$.

Proof. 1. Suppose first that $A$ is irreducible. Then Lemma 6.2.4 yields that $\rho(A) \ge \rho(C)$. Equality holds if and only if $A = C$. Suppose next that $C$ is irreducible. Then $A$ is irreducible. Hence $\rho(A) \ge \rho(C)$, and equality holds if and only if $C = A$.

Assume now that $A$ is reducible. Let $A(\varepsilon), C(\varepsilon)$ be defined as in the proof of Theorem 6.4.1. For $\varepsilon > 0$ the above arguments show that $\rho(A(\varepsilon)) \ge \rho(C(\varepsilon))$. Letting $\varepsilon \searrow 0$ we deduce that $\rho(A) \ge \rho(C)$.

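2. By considering a matrix $A_1 = PAP^T$ for a corresponding $P \in \mathcal{P}_n$ we may assume that

$$A_1 = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix},$$

where $B = A_{11}$. Clearly, $\rho(A_1) = \rho(A)$, and $A_1$ is irreducible if and only if $A$ is irreducible. Let

$$C = \begin{bmatrix} B & 0_{m \times (n-m)} \\ 0_{(n-m) \times m} & 0_{(n-m) \times (n-m)} \end{bmatrix}.$$

Then $A_1 \ge C$. Part 1 yields that $\rho(A) = \rho(A_1) \ge \rho(C) = \rho(B)$. Suppose that $A_1$ is irreducible. Since $C$ is reducible, $A_1 \ne C$. Hence $\rho(B) = \rho(C) < \rho(A_1) = \rho(A)$. □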

Lemma 6.4.3 Let $A \in \mathbb{R}_+^{n \times n}$. Assume that $t > \rho(A)$. Then $(tI - A)^{-1} \ge 0$. Furthermore, $(tI - A)^{-1} > 0$ if and only if $A$ is irreducible.

Proof. Since $t > \rho(A)$ it follows that $\det(tI - A) \ne 0$. (Actually, $\det(tI - A) > 0$. See Problem 1.) So $(tI - A)^{-1}$ exists and $(tI - A)^{-1} = \frac{1}{t}(I - \frac{1}{t}A)^{-1}$. Since $\rho(\frac{1}{t}A) < 1$ we deduce the Neumann expansion [Neu77], which holds for bounded operators in Banach spaces,

(6.4.2) $(tI - A)^{-1} = \sum_{k=0}^{\infty} \frac{A^k}{t^{k+1}}$, for $|t| > \rho(A)$.

Since $A \ge 0$ we deduce that $(tI - A)^{-1} \ge 0$. Let $A^k = [a_{ij}^{(k)}]$. Then the $(i, j)$ entry of $(tI - A)^{-1}$ is positive if and only if $a_{ij}^{(k)} > 0$ for some $k = k(i, j)$. This shows that $(tI - A)^{-1} > 0$ if and only if $A$ is irreducible. □
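
As an illustration of the Neumann expansion (6.4.2), the sketch below sums the first terms of the series for two small matrices. The function names `mat_mul` and `neumann_resolvent` and the two test matrices are assumptions made for this example only.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def neumann_resolvent(A, t, terms=200):
    """Partial sum of sum_k A^k / t^(k+1), approximating (tI - A)^{-1} for t > rho(A)."""
    n = len(A)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # A^0
    total = [[0.0] * n for _ in range(n)]
    scale = 1.0 / t
    for _ in range(terms):
        for i in range(n):
            for j in range(n):
                total[i][j] += scale * power[i][j]
        power = mat_mul(power, A)
        scale /= t
    return total

irreducible = [[0.0, 1.0], [1.0, 0.0]]   # rho = 1, irreducible
reducible = [[1.0, 1.0], [0.0, 1.0]]     # rho = 1, upper triangular, reducible

R1 = neumann_resolvent(irreducible, 2.0)  # close to [[2/3, 1/3], [1/3, 2/3]]
R2 = neumann_resolvent(reducible, 2.0)    # close to [[1, 1], [0, 1]]
print(R1)
print(R2)
```

In the irreducible case every entry of the approximate inverse is positive, while in the reducible case the $(2,1)$ entry stays exactly zero, in agreement with Lemma 6.4.3.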



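Theorem 6.4.4 Let $A \in \mathbb{R}_+^{n \times n}$ be a nonnegative matrix. Then there exists a permutation matrix $P \in \mathcal{P}_n$ such that $B = PAP^T$ is of the following block upper triangular form.

(6.4.3)

$$
B = \begin{bmatrix}
B_{11} & B_{12} & \cdots & B_{1t} & B_{1(t+1)} & B_{1(t+2)} & \cdots & B_{1(t+f)} \\
0 & B_{22} & \cdots & B_{2t} & B_{2(t+1)} & B_{2(t+2)} & \cdots & B_{2(t+f)} \\
\vdots & \vdots & \ddots & \vdots & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & B_{tt} & B_{t(t+1)} & B_{t(t+2)} & \cdots & B_{t(t+f)} \\
0 & 0 & \cdots & 0 & B_{(t+1)(t+1)} & 0 & \cdots & 0 \\
0 & 0 & \cdots & 0 & 0 & B_{(t+2)(t+2)} & \cdots & 0 \\
\vdots & \vdots & & \vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 0 & 0 & 0 & \cdots & B_{(t+f)(t+f)}
\end{bmatrix},
$$

$B_{ij} \in \mathbb{R}_+^{n_i \times n_j}$, $i, j = 1, \ldots, t + f$, $n_1 + \cdots + n_{t+f} = n$, $t \ge 0$, $f \ge 1$.

Each $B_{ii}$ is irreducible, and the submatrix $B' := [B_{ij}]_{i,j=t+1}^{t+f}$ is block diagonal. If $t = 0$ then $B$ is block diagonal. If $t \ge 1$ then for each $i = 1, \ldots, t$ not all the matrices $B_{i(i+1)}, \ldots, B_{i(t+f)}$ are zero matrices.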

Proof. Let $D_r = (W, F)$ be the reduced graph of $D(A) = (\langle n \rangle, E)$. Then $D_r$ is a diforest. Let $\ell \ge 0$ be the length of the longest path in the digraph $D_r$. For a given vertex $w \in W$ let $\ell(w)$ be the length of the longest path in $D_r$ from $w$. So $\ell(w) \in [0, \ell]$. For $j \in \{0, \ldots, \ell\}$ denote by $W_j$ the set of all vertices in $W$ such that $\ell(w) = j$. Since $D_r$ is a diforest, it follows that $W_\ell, \ldots, W_0$ is a decomposition of $W$ into nonempty sets. Note that if there is a diedge in $D_r$ from $W_i$ to $W_j$ then $i > j$. Also there is always at least one diedge from $W_i$ to $W_{i-1}$ for $i = \ell, \ldots, 1$, if $\ell > 0$.

Assume that $\#W_j = m_{1+\ell-j}$ for $j = 0, \ldots, \ell$. Let $M_0 = 0$ and $M_j = \sum_{i=1}^{j} m_i$ for $j = 1, \ldots, \ell$. Then we name the vertices of $W_j$ as $\{M_{\ell-j} + 1, \ldots, M_{\ell-j} + m_{1+\ell-j}\}$ for $j = 0, \ldots, \ell$. Let $f := \#W_0 = m_{\ell+1}$ and $t := \#(\cup_{j=1}^{\ell} W_j) = \sum_{j=1}^{\ell} m_j$. Note that $f \ge 1$ and $t = 0$ if and only if $\ell = 0$. Hence the representation matrix $A(D_r)$ is strictly upper triangular. Furthermore the last $f$ rows of $A(D_r)$ are zero rows.

Recall that each vertex in $W$ corresponds to a maximal strongly connected component of $D(A)$. That is, to each $i \in W = \langle t + f \rangle$ one has a nonempty subset $V_i \subset \langle n \rangle$, which corresponds to a maximal strongly connected component of $D(A)$. Let $n_i := \#V_i$ for $i = 1, \ldots, t + f$. Let $N_0 = 0$ and $N_i = \sum_{j=1}^{i} n_j$, $i = 1, \ldots, t + f$. Rename the vertices of $D(A)$ to satisfy $V_i = \{N_{i-1} + 1, \ldots, N_{i-1} + n_i\}$. Then $PAP^T$ is of the form (6.4.3). Furthermore, the digraph induced by $B_{ii}$ is a strongly connected component of $D(A)$. Hence $B_{ii}$ is irreducible. Note that $B_{ij} = 0$, for $j > i$, if and only if there is no diedge from the vertex $i$ to the vertex $j$ in $D_r$. Recall that for $i \le t$, the vertex $i$ represents a vertex in $W_k$, for some $k \ge 1$. Hence, for $i \le t$ there exists $j > i$ such that $B_{ij} \ne 0$. □
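
The construction in the proof can be turned into a small procedure. The sketch below is one possible realisation, not the author's algorithm: it assumes Tarjan's algorithm for the strongly connected components of $D(A)$, and it orders the transient components by a topological order of the reduced graph rather than by the level sets $W_j$ used in the proof. The names `strongly_connected_components` and `frobenius_order` are hypothetical.

```python
def strongly_connected_components(A):
    """Tarjan's algorithm on the digraph of A (edge i -> j when A[i][j] > 0).

    Components are emitted in reverse topological order of the reduced graph.
    """
    n = len(A)
    index, low = {}, {}
    stack, on_stack, comps = [], set(), []
    counter = [0]

    def visit(v):
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in range(n):
            if A[v][w] > 0:
                if w not in index:
                    visit(w)
                    low[v] = min(low[v], low[w])
                elif w in on_stack:
                    low[v] = min(low[v], index[w])
        if low[v] == index[v]:          # v is the root of a component
            comp = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.append(w)
                if w == v:
                    break
            comps.append(sorted(comp))

    for v in range(n):
        if v not in index:
            visit(v)
    return comps

def frobenius_order(A):
    """Return (B, blocks, t): B = P A P^T in the shape (6.4.3), the vertex
    blocks V_1, ..., V_{t+f} (0-based labels), and the number t of transient blocks."""
    n = len(A)
    comps = strongly_connected_components(A)[::-1]   # topological order

    def is_final(comp):
        inside = set(comp)
        return all(A[v][w] == 0 for v in comp for w in range(n) if w not in inside)

    transient = [c for c in comps if not is_final(c)]
    final = [c for c in comps if is_final(c)]
    blocks = transient + final        # final components have no outgoing edges
    perm = [v for c in blocks for v in c]
    B = [[A[i][j] for j in perm] for i in perm]
    return B, blocks, len(transient)

A = [[0, 1, 0, 0],
     [1, 0, 0, 0],
     [1, 0, 0, 1],
     [0, 0, 1, 0]]
B, blocks, t = frobenius_order(A)
print(blocks, t)
print(B)
```

For this matrix the component $\{2, 3\}$ (0-based labels) is transient and $\{0, 1\}$ is final, so $t = 1$, $f = 1$, and the lower left block of $B$ is zero.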

Theorem 6.4.5 Let $A \in \mathbb{R}_+^{n \times n}$. Then there exists a positive eigenvector $x > 0$ such that $Ax = \rho(A)x$ if and only if the following conditions hold. Let $B$ be the Frobenius normal form of $A$ given in Theorem 6.4.4. Then

1. $\rho(B_{(t+1)(t+1)}) = \cdots = \rho(B_{(t+f)(t+f)})$;
2. $\rho(B_{ii}) < \rho(B_{(t+1)(t+1)})$ for $i = 1, \ldots, t$.

Proof. Clearly, $A$ has a positive eigenvector corresponding to $\rho(A)$ if and only if $B$ has a positive eigenvector corresponding to $\rho(B) = \rho(A)$. Thus we may assume that $A = B$. Suppose first that $Bx = \rho(B)x$ for $x > 0$. Let $x^T = (u_1^T, \ldots, u_{t+f}^T)$, where $0 < u_j \in \mathbb{R}^{n_j}$ for $j = 1, \ldots, t + f$. Since $B' = [B_{ij}]_{i,j=t+1}^{t+f}$ is a block diagonal matrix we deduce that $B_{ii}u_i = \rho(B)u_i$ for $i = t+1, \ldots, t + f$. Proposition 6.2.7 yields the equality $\rho(B) = \rho(B_{(t+1)(t+1)}) = \cdots = \rho(B_{(t+f)(t+f)})$. Hence 1 holds. Furthermore,

(6.4.4) $B_{ii}u_i + \sum_{j=i+1}^{t+f} B_{ij}u_j = \rho(B)u_i$, $i = 1, \ldots, t$.

Since for each $i \in [1, t]$ there exists an integer $j(i) \in [i + 1, t + f]$ such that $B_{ij(i)} \gneq 0$ we deduce that $B_{ii}u_i \lneq \rho(B)u_i$ for each $i \in [1, t]$. Use Problem 11 to deduce that $\rho(B_{ii}) < \rho(B) = \rho(B_{(t+1)(t+1)})$ for $i \in [1, t]$.

Assume now that 1 and 2 hold. Let $r = \rho(B_{(t+1)(t+1)}) = \cdots = \rho(B_{(t+f)(t+f)})$. Then $B_{ii}u_i = ru_i$, $u_i > 0$ for $i = t + 1, \ldots, t + f$. Also, since $\rho(B) = \max_{i \in \langle t+f \rangle} \rho(B_{ii})$, we deduce that $\rho(B) = r$.

Consider the equality (6.4.4) for $i = t$. Rewrite it as

$$v_t := \sum_{j=t+1}^{t+f} B_{tj}u_j = (rI - B_{tt})u_t.$$

Since some $B_{tj} \gneq 0$ it follows that $v_t \gneq 0$. As $r > \rho(B_{tt})$ and $B_{tt}$ is irreducible, Lemma 6.4.3 implies that $(rI - B_{tt})^{-1} > 0$. Hence $u_t := (rI - B_{tt})^{-1}v_t > 0$. Thus we showed that there exists $u_t > 0$ so that equality (6.4.4) holds for $i = t$. Suppose we already showed that there exist $u_t, \ldots, u_k > 0$ such that (6.4.4) holds for $i = t, t - 1, \ldots, k$. Consider the equality (6.4.4) for $i = k - 1$. Viewing this equality as a system of equations in $u_{k-1}$, the above arguments show the existence of a unique solution $u_{k-1} > 0$. Let $x^T = (u_1^T, \ldots, u_{t+f}^T)$. Then $Bx = rx$. □

Corollary 6.4.6 Let $A \in \mathbb{R}_+^{n \times n}$ and assume that $Ax = \rho(A)x$, $A^Ty = \rho(A)y$, where $x, y > 0$. Then the Frobenius normal form of $A$, given by (6.4.3), is block diagonal.

Theorem 6.4.7 Let $A \in \mathbb{R}_+^{n \times n}$. Assume that $Ax = x$ for some $x > 0$. Let $B = [B_{ij}]_{i,j=1}^{t+f}$ be the Frobenius normal form of $A$ given by (6.4.3). Denote $B^k = [B_{ij}^{(k)}]_{i,j=1}^{t+f}$ for $k = 1, 2, \ldots$. Then the block matrix form of $B^k$ is of the form (6.4.3). Furthermore, the following conditions hold.

1. $\lim_{k \to \infty} B_{ij}^{(k)} = 0$ for $i, j = 1, \ldots, t$.

2. $A^k$, $k = 1, 2, \ldots$, converge to a limit if and only if the matrices $B_{ii}$ are primitive for $i = t + 1, \ldots, t + f$.

3. Assume that $B_{ii}$ are primitive and $B_{ii}u_i = u_i$, $B_{ii}^Tv_i = v_i$, $u_i, v_i > 0$, $v_i^Tu_i = 1$ for $i = t + 1, \ldots, t + f$. Then $\lim_{k \to \infty} B^k = E = [E_{ij}]_{i,j=1}^{t+f} \ge 0$, where $E$ has the block matrix form (6.4.3). Furthermore

(a) $E$ is a nonnegative projection, i.e. $E^2 = E$.

(b) $E_{ii} = u_iv_i^T$ for $i = t+1, \ldots, t+f$, and $E_{ij} = 0$ for $i, j = 1, \ldots, t$.

(c) For each $i \in \langle t \rangle$ and a given row $r$ in the matrices $E_{i(t+1)}, \ldots, E_{i(t+f)}$, there exists $j > t$, $j = j(r)$, such that $E_{ij}$ has a positive element in row $r$.

Proof. Since $B$ is block upper triangular, it follows that $B^k$ is block upper triangular. Since $B_{ij} = 0$ for $j > i > t$ it follows that $B_{ij}^{(k)} = 0$ for $j > i > t$. (One can prove it by induction on $k$.) Let $\tilde{B} := [B_{ij}]_{i,j=1}^{t}$. Since $B$ is block upper triangular it follows that $\tilde{B}^k = [B_{ij}^{(k)}]_{i,j=1}^{t}$. Furthermore, $\rho(\tilde{B}) = \max_{i \in \langle t \rangle} \rho(B_{ii})$. As $B$ has a positive eigenvector, we deduce from Theorem 6.4.5 that $\rho(B_{ii}) < \rho(B_{(t+1)(t+1)}) = 1$ for $i \in \langle t \rangle$. Hence $\rho(\tilde{B}) < 1$. Therefore $\lim_{k \to \infty} \tilde{B}^k = 0$. This implies 1.

Clearly, $A^k$, $k = 1, 2, \ldots$ converges if and only if the sequence $B^k$, $k = 1, 2, \ldots$ converges. Assume that the second sequence converges. As $B_{ii}^{(k)} = B_{ii}^k$ for $i = t + 1, \ldots, t + f$, we deduce that the sequences $B_{ii}^k$, $k = 1, 2, \ldots$, converge for $i \in [t + 1, t + f]$. Since $B_{ii}$ is irreducible and $\rho(B_{ii}) = 1$ for $i \in [t + 1, t + f]$, the convergence of $B_{ii}^k$, $k = 1, 2, \ldots$ implies that the only eigenvalue of $B_{ii}$ on the unit circle is 1. (Recall that the eigenvalues of $B_{ii}^k$ are the $k$-th powers of the eigenvalues of $B_{ii}$.) So each $B_{ii}$ is primitive.

Assume now that each $B_{ii}$ is primitive. Hence, the algebraic multiplicity of the eigenvalue 1 of $B$ is $f$. We claim that the geometric multiplicity of 1 is also $f$. Let $u_{t+1}, \ldots, u_{t+f}$ be defined as in 3. For any $a_{t+1}, \ldots, a_{t+f} > 0$ we have that $B_{ii}(a_iu_i) = a_iu_i$ for $i = t+1, \ldots, t+f$. From the proof of Theorem 6.4.5 it follows that $B$ has a positive eigenvector $x$, $Bx = x$, such that $x^T = (x_1^T, \ldots, x_t^T, a_{t+1}u_{t+1}^T, \ldots, a_{t+f}u_{t+f}^T)$, $x_i \in \mathbb{R}_+^{n_i}$, $i = 1, \ldots, t$. Hence the subspace of eigenvectors of $B$ corresponding to the eigenvalue 1 has dimension at least $f$. Since the algebraic multiplicity of 1 is $f$ it follows that the geometric multiplicity of 1 is $f$. As all other eigenvalues $\lambda$ of $B$ satisfy $|\lambda| < 1$, Fact ?? yields that $\lim_{k \to \infty} B^k = E$. This implies 2.

Since $B^k$ has the same block upper triangular form as $B$ it follows that $E = [E_{ij}]_{i,j=1}^{t+f}$ has the block triangular form. So $E_{ij} = 0$ for $j > i > t$. Furthermore, 1 implies that $E_{ij} = 0$ for $i, j \in \langle t \rangle$. Let $u_i, v_i$, $i = t + 1, \ldots, t + f$ be defined as in 3. The proof of Lemma 6.2.8 yields that $\lim_{k \to \infty} B_{ii}^k = u_iv_i^T = E_{ii}$ for $i > t$. Since $B^{2k} = B^kB^k$ we deduce that $E^2 = E$. 3c will be proved later. □



Theorem 6.4.8 Let $F \in \mathbb{R}_+^{m \times m}$ be a projection, i.e. $F^2 = F$. Then $F$ is permutationally similar to a nonnegative projection $G$, i.e. $G = PFP^T$ for some $P \in \mathcal{P}_m$, of exactly one of the following forms.

1. $G = 0_{m \times m}$.

2. $G = E$, where $n = m$ and $E$ has a block upper triangular form given in conditions 3a-3c of Theorem 6.4.7. That is, one of the following conditions holds.

(a) $E = T$, where $T = \operatorname{diag}(u_1v_1^T, \ldots, u_tv_t^T)$, where $0 < u_i, v_i \in \mathbb{R}_+^{n_i}$, $i = 1, \ldots, t$.

(b) $E = \begin{bmatrix} 0 & R \\ 0 & T \end{bmatrix}$, where $T$ is of the form given in (2a), and each $k$-th row of $R$ is of the form $(r_{k1}v_1^T, \ldots, r_{kt}v_t^T)$, where $(r_{k1}, \ldots, r_{kt}) \gneq 0$.

3. $G = \begin{bmatrix} E & H \\ 0 & 0 \end{bmatrix}$, where $E \in \mathbb{R}_+^{n \times n}$ is of the form described in 2, where $1 \le n < m$. So each column of $H$ is either a zero column, or a nonzero nonnegative eigenvector of $E$ corresponding to the eigenvalue 1.

Proof. Recall that $\operatorname{spec} F \subset \{0, 1\}$ and $F$ is similar to a diagonal matrix. Suppose that $\operatorname{spec} F = \{0\}$. Then $F = 0$ and the condition 1 holds. Assume now that $1 \in \operatorname{spec} F$. So there exists $x \gneq 0$ such that $Fx = x$. Since $F(F\mathbf{1}) = (F\mathbf{1})$ it follows that $x = F\mathbf{1}$ is an eigenvector of $F$ corresponding to 1 with the maximal number of nonzero coordinates. Suppose first that $x$ has no zero coordinates, i.e. $F$ does not have zero rows. Let $B$ be the Frobenius normal form of $F$ as in Theorem 6.4.7. As $B^2 = B$ we deduce that $E = B$. So 2 holds.

Assume finally that $F$ has exactly $m - n$ zero rows. So there exists $Q \in \mathcal{P}_m$ such that $QF\mathbf{1} = (y^T, 0_{m-n}^T)^T$. Thus $QFQ^T = \begin{bmatrix} F_1 & F_2 \\ 0 & 0 \end{bmatrix}$, where $F_1^2 = F_1$ and $F_1y = y$, $y > 0$. Use 2 for $F_1$ to deduce 3. □



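Theorem 6.4.9 Let $A \in \mathbb{R}_+^{n \times n}$. Then

(6.4.5) $\rho(A) = \limsup_{m \to \infty} (\operatorname{tr} A^m)^{\frac{1}{m}}$.

(Here $\operatorname{tr} B$ is the trace of a square matrix $B$, i.e. the sum of its diagonal entries.)

Proof. Clearly, for any $B \in \mathbb{C}^{n \times n}$, $|\operatorname{tr} B| = |\sum_{i=1}^{n} \lambda_i(B)|$. Hence $|\operatorname{tr} B| \le n\rho(B)$. Therefore, $\operatorname{tr} A^m = |\operatorname{tr} A^m| \le n\rho(A^m) = n\rho(A)^m$. Thus, $(\operatorname{tr} A^m)^{\frac{1}{m}} \le n^{\frac{1}{m}}\rho(A)$. Therefore, $\limsup_{m \to \infty} (\operatorname{tr} A^m)^{\frac{1}{m}} \le \rho(A)$. It is left to show the opposite inequality.

Assume first that $A$ is irreducible and primitive. Let $Au = \rho(A)u$, $A^Tv = \rho(A)v$, $0 < u, v$, $v^Tu = 1$. Theorem 6.4.7 yields that $\lim_{m \to \infty} \rho(A)^{-m}A^m = uv^T$. Hence

$$\lim_{m \to \infty} \rho(A)^{-m} \operatorname{tr} A^m = \operatorname{tr} uv^T = 1 \;\Rightarrow\; \lim_{m \to \infty} (\operatorname{tr} A^m)^{\frac{1}{m}} = \rho(A).$$

Assume that $A$ is irreducible and imprimitive. If $A$ is the $1 \times 1$ zero matrix, then (6.4.5) trivially holds. Assume that $n > 1$. Without loss of generality we can assume that $A$ is of the form given in Theorem 6.2.1 part 5d. Then $A^h = \operatorname{diag}(B_1, \ldots, B_h)$, where each $B_j$ is primitive and $\rho(B_j) = \rho(A)^h$, see Lemma 6.2.9. So $\operatorname{tr} A^{hk} = \sum_{j=1}^{h} \operatorname{tr} B_j^k$. Since each $B_j$ is primitive and irreducible, we deduce from the previous arguments that $\lim_{k \to \infty} (\operatorname{tr} A^{hk})^{\frac{1}{hk}} = \rho(A)$. Hence (6.4.5) holds in this case too.

Assume now that $A$ is not irreducible. Without loss of generality we can assume that $A$ is in the Frobenius form (6.4.3). Then there exists $i \in \langle t + f \rangle$ such that $\rho(A) = \rho(B_{ii})$. Clearly

$$(\operatorname{tr} A^m)^{\frac{1}{m}} = \Big(\sum_{j=1}^{t+f} \operatorname{tr} B_{jj}^m\Big)^{\frac{1}{m}} \ge (\operatorname{tr} B_{ii}^m)^{\frac{1}{m}} \;\Rightarrow\; \limsup_{m \to \infty} (\operatorname{tr} A^m)^{\frac{1}{m}} \ge \limsup_{m \to \infty} (\operatorname{tr} B_{ii}^m)^{\frac{1}{m}} = \rho(A).$$

□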
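
A short numerical look at (6.4.5) shows why the upper limit, and not the limit, appears in the statement. This is a sketch with hypothetical helper names (`mat_mul`, `trace_roots`); the two test matrices, a cyclic permutation matrix (imprimitive) and a primitive $2 \times 2$ matrix, are chosen for illustration.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace_roots(A, max_m):
    """Return the list [(tr A^m)^(1/m) for m = 1, ..., max_m]."""
    n = len(A)
    power = A
    roots = []
    for m in range(1, max_m + 1):
        tr = sum(power[i][i] for i in range(n))
        roots.append(tr ** (1.0 / m))
        power = mat_mul(power, A)
    return roots

# Imprimitive example: a 3-cycle, rho = 1, tr A^m = 3 if 3 | m and 0 otherwise.
cycle = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
print(trace_roots(cycle, 9))

# Primitive example: rho is the golden ratio (1 + sqrt(5)) / 2.
fib = [[1, 1], [1, 0]]
print(trace_roots(fib, 60)[-1])
```

For the cycle the sequence keeps returning to 0, so it has no limit, but its upper limit is $1 = \rho(A)$. For the primitive matrix the sequence itself converges to $\rho(A)$.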



Problems 

1. Let $A \in \mathbb{R}_+^{n \times n}$. Show that for $t > \rho(A)$, $\det(tI - A) > 0$.

6.5 Stochastic matrices and Markov Chains 

Definition 6.5.1 A matrix $S$ is called a stochastic matrix if $S \in \mathbb{R}_+^{n \times n}$, for some integer $n \ge 1$, and $S\mathbf{1} = \mathbf{1}$. Denote by $\mathcal{S}_n \subset \mathbb{R}_+^{n \times n}$ the set of $n \times n$ stochastic matrices. A matrix $A$ is called doubly stochastic if $A$ and $A^T$ are stochastic matrices. Denote by $\Omega_n \subset \mathcal{S}_n$ the set of $n \times n$ doubly stochastic matrices.

Note that $A \in \mathbb{R}_+^{n \times n}$ is a stochastic matrix if and only if each row of $A$ is a probability vector. Furthermore, $\mathcal{S}_n$ and $\Omega_n$ are compact semigroups with respect to the product of matrices, see Problem 1. The following lemma is straightforward, see Problem 2.

Lemma 6.5.2 Let $A \in \mathbb{R}_+^{n \times n}$. Then $A = DSD^{-1}$ for some $S \in \mathcal{S}_n$ and a diagonal matrix $D \in \mathbb{R}_+^{n \times n}$ with positive diagonal entries if and only if $Ax = x$ for some positive $x \in \mathbb{R}^n$.
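
One direction of Lemma 6.5.2 is constructive: if $Ax = x$ with $x > 0$, take $D = \operatorname{diag}(x)$ and $S = D^{-1}AD$, so that $S\mathbf{1} = D^{-1}Ax = \mathbf{1}$. A minimal sketch, where the function name `to_stochastic` and the sample data are assumptions of this example:

```python
def to_stochastic(A, x):
    """Return S = D^{-1} A D with D = diag(x).

    S is stochastic whenever A >= 0, x > 0 and A x = x (not checked here).
    """
    n = len(A)
    return [[A[i][j] * x[j] / x[i] for j in range(n)] for i in range(n)]

A = [[0.5, 1.0], [0.25, 0.5]]
x = [2.0, 1.0]                      # A x = x
S = to_stochastic(A, x)
print(S)                            # every row sums to 1
```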

Definition 6.5.3 Let $S \in \mathcal{S}_n$ be irreducible. We will assume the normalization that the eigenvectors of $S$ and $S^T$ corresponding to the eigenvalue 1 are of the form $\mathbf{1} = \mathbf{1}_n \in \mathbb{R}_+^n$ and $\pi \in \Pi_n$, respectively, unless stated otherwise. $S$ is called aperiodic if it is primitive, and periodic if it is imprimitive.

Theorem 6.5.4 Let $A \in \mathcal{S}_n$. Denote by $B = [B_{ij}]_{i,j=1}^{t+f}$ the Frobenius normal form of $A$ given by (6.4.3). Then $B, B_{(t+1)(t+1)}, \ldots, B_{(t+f)(t+f)}$ are stochastic. Furthermore the conditions 1-3b of Theorem 6.4.7 hold. The limit matrix $E$ is stochastic. In the condition 3 we can assume that $u_i = \mathbf{1}_{n_i}$, $v_i \in \Pi_{n_i}$ for $i = t+1, \ldots, t + f$.

Finally the condition 3c of Theorem 6.4.7 is replaced by the following stronger condition. For each $i \in \langle t \rangle$ the sum of the entries of a row $r$ in the matrices $E_{i(t+1)}, \ldots, E_{i(t+f)}$ is 1, for any given row $r$.






Proof. Since $A$ is stochastic, in view of Problem 1b, $B$ is stochastic. Since $B$ is block upper triangular and $B' = [B_{ij}]_{i,j=t+1}^{t+f}$ is block diagonal, it follows that $B_{ii}$ is stochastic for $i = t + 1, \ldots, t + f$. Since $B^k$ is stochastic, Problem 1c implies that $E = [E_{ij}]_{i,j=1}^{t+f}$ is stochastic. Since $[E_{ij}]_{i,j=1}^{t} = 0$ we deduce the last part of the theorem. □

Proof of condition 3c of Theorem 6.4.7. In view of Lemma 6.5.2, $B$ is diagonally similar to a stochastic matrix. Hence 3c follows from the last part of Theorem 6.5.4. □

We now recall the classical connection between stochastic matrices and Markov chains. To each probability vector $\pi = (\pi_1, \ldots, \pi_n)^T \in \Pi_n$ we associate a random variable $X$, which takes values in the set $\langle n \rangle$, such that $P(X = i) = \pi_i$ for $i = 1, \ldots, n$. Then $\pi = \pi(X)$ is called the distribution of $X$.

Assume that we are given a sequence of random variables $X_0, X_1, \ldots$ each taking values in the set $\langle n \rangle$. Let $s_{ij}^{(k)}$ be the conditional probability of $X_k = j$ given that $X_{k-1} = i$:

(6.5.1) $s_{ij}^{(k)} := P(X_k = j \mid X_{k-1} = i)$, $i, j = 1, \ldots, n$, $k = 1, \ldots$

Clearly, $S_k = [s_{ij}^{(k)}]_{i,j=1}^{n}$ is a stochastic matrix for $k = 1, 2, \ldots$.

Definition 6.5.5 Let $X_0, X_1, \ldots$ be a sequence of random variables taking values in $\langle n \rangle$. Then

1. $X_0, X_1, \ldots$ is called a homogeneous Markov chain if

$P(X_k = j_k \mid X_{k-1} = j_{k-1}, \ldots, X_0 = j_0) = P(X_1 = j_k \mid X_0 = j_{k-1})$ for $k = 1, 2, \ldots$

2. $X_0, X_1, \ldots$ is called a nonhomogeneous Markov chain if

$P(X_k = j_k \mid X_{k-1} = j_{k-1}, \ldots, X_0 = j_0) = P(X_k = j_k \mid X_{k-1} = j_{k-1})$ for $k = 1, 2, \ldots$

(Note that a homogeneous Markov chain is a special case of a nonhomogeneous Markov chain.)

3. A nonhomogeneous Markov chain is said to have a limiting distribution if the limit $\pi_\infty(\pi_0) := \lim_{k \to \infty} \pi_k$ exists. If $\pi_\infty$ does not depend on $\pi_0$ then $\pi_\infty$ is called the stationary distribution of the Markov chain.

The following lemma is straightforward, see Problem ??.

Lemma 6.5.6 Let $X_0, X_1, \ldots$ be a sequence of random variables taking values in $\langle n \rangle$. Let $\pi_k := \pi(X_k)$ be the distribution of $X_k$ for $k = 0, 1, \ldots$. If $X_0, X_1, \ldots$ is a nonhomogeneous Markov chain then $\pi_k^T = \pi_0^TS_1 \cdots S_k$ for $k = 1, \ldots$, where $S_k$ are defined by (6.5.1). In particular, if $X_0, X_1, \ldots$ is a homogeneous Markov process, i.e. $S_k = S$, $k = 1, 2, \ldots$, then $\pi_k^T = \pi_0^TS^k$ for $k = 1, \ldots$.

Theorem 6.5.7 Let $X_0, X_1, \ldots$ be a homogeneous Markov chain on $\langle n \rangle$, given by a stochastic matrix $S = [s_{ij}] \in \mathcal{S}_n$. Let $D(S)$ and $D_r(S)$ be the digraph and the reduced digraph corresponding to $S$. Label the vertices of the reduced graph $D_r(S)$ by $\{1, \ldots, t+f\}$. Let $V_1, \ldots, V_{t+f}$ be the decomposition of $\langle n \rangle$ into the strongly connected components of the digraph $D(S)$. Assume that $B = PSP^T$, $P \in \mathcal{P}_n$, is given by the form (6.4.3). The vertices (states) in $\cup_{i=1}^{t} V_i$ are called the transient vertices (states). (Note that if $t = 0$ then no transient vertices exist.) The vertices (states) in $\cup_{i=t+1}^{t+f} V_i$ are called the final vertices (states). $V_{t+1}, \ldots, V_{t+f}$ are called the final strongly connected components. Furthermore the following conditions hold.

1. For each $i \in \cup_{j=1}^{t} V_j$, $\lim_{k \to \infty} P(X_k = i) = 0$.

2. $X_0, X_1, \ldots$ have a limiting distribution if and only if each stochastic matrix corresponding to $V_i$ is aperiodic for $i = t + 1, \ldots, t + f$. I.e. the irreducible matrices $B_{ii}$ are primitive for $i = t + 1, \ldots, t + f$.

3. $X_0, X_1, \ldots$ have a stationary distribution if and only if $f = 1$. I.e. there exists only one final strongly connected component.

Proof. Without loss of generality we may assume that $S = B$. Let $\pi_k^T = (\pi_{1,k}^T, \ldots, \pi_{t+f,k}^T)$ for $k = 0, 1, \ldots$. From the proof of Theorem 6.4.7 we deduce that

$$\pi_{i,k}^T = \sum_{j=1}^{i} \pi_{j,0}^TB_{ji}^{(k)} \quad \text{for } i = 1, \ldots, t.$$

In view of part 1 of Theorem 6.4.7 we deduce that $\lim_{k \to \infty} \pi_{i,k} = 0$ for $i = 1, \ldots, t$. This proves part 1.

Suppose that $\pi_0$ is supported only on $V_{t+i}$ for some $i \in \langle f \rangle$. That is, $\pi_{j,0} = 0$ if $j \ne t + i$. Then each $\pi_k$ is supported on $V_{t+i}$. Furthermore, $\pi_{t+i,k}^T = \pi_{t+i,0}^TB_{(t+i)(t+i)}^k$. Assume that $B_{(t+i)(t+i)}$ is imprimitive. Choose $\pi_{t+i,0}$ to have one coordinate equal to 1 and all other coordinates equal to zero. Assuming that $B_{(t+i)(t+i)}$ has the form given in 5d of Theorem 6.2.1, we see that there is no limit distribution. Hence, to have the limit distribution for each $\pi_0$ we must assume that $B_{(t+i)(t+i)}$ is primitive for $i = 1, \ldots, f$.

Assume that $B_{(t+i)(t+i)}$ is primitive for $i = 1, \ldots, f$. Then Theorem 6.5.4 implies that $\pi_\infty^T = \pi_0^TE$, where $E = \lim_{k \to \infty} B^k$ is the stochastic projection. As we pointed out before, if we assume that $\pi_0$ is supported only on $V_{t+i}$ then the limit probability is also supported on $V_{t+i}$. Hence, to have a stationary distribution we must have that $f = 1$.

Assume that $f = 1$. Then $\lim_{k \to \infty} B_{(t+1)(t+1)}^k = \mathbf{1}_{n_{t+1}}\pi_{t+1}^T$. Observe that the limit probability is $\pi_0^TE = (\pi_0^TE)E$. Since $\pi_0^TE$ is supported only on $V_{t+1}$ it follows that $\pi_0^TE^2 = (\underbrace{0^T, \ldots, 0^T}_{t}, \pi_{t+1}^T)$, which is independent of $\pi_0$. □
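
The three parts of Theorem 6.5.7 can be observed on small chains by iterating $\pi_k^T = \pi_{k-1}^TS$ from Lemma 6.5.6. The sketch below uses hypothetical helper names (`step`, `distribution_after`) and two illustrative matrices: `S1` has one transient state and one aperiodic final component, `S2` is a periodic chain on two states.

```python
def step(pi, S):
    """One step pi_k^T = pi_{k-1}^T S of the distribution."""
    n = len(S)
    return [sum(pi[i] * S[i][j] for i in range(n)) for j in range(n)]

def distribution_after(pi0, S, k):
    """Distribution of X_k for a homogeneous chain with transition matrix S."""
    pi = list(pi0)
    for _ in range(k):
        pi = step(pi, S)
    return pi

# State 0 is transient; {1, 2} is the only final component, and it is aperiodic.
S1 = [[0.5, 0.25, 0.25],
      [0.0, 0.5, 0.5],
      [0.0, 0.5, 0.5]]
print(distribution_after([1.0, 0.0, 0.0], S1, 200))
print(distribution_after([0.0, 1.0, 0.0], S1, 200))

# A periodic (imprimitive) chain: the distribution oscillates, so no limit exists.
S2 = [[0, 1], [1, 0]]
print(distribution_after([1, 0], S2, 10), distribution_after([1, 0], S2, 11))
```

For `S1` both starting distributions lead to the same limit $(0, \tfrac12, \tfrac12)$: the transient mass vanishes and, since $f = 1$, the limit does not depend on $\pi_0$. For `S2` the distribution alternates between $(1, 0)$ and $(0, 1)$.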

The proof of the above theorem yields the well known result. 

Corollary 6.5.8 Let $X_0, X_1, \ldots$ be a homogeneous Markov chain on $\langle n \rangle$, given by an aperiodic stochastic matrix $S = [s_{ij}] \in \mathcal{S}_n$. Assume that $S^T\pi = \pi$ for a unique $0 < \pi \in \Pi_n$. Then this Markov process has a stationary distribution equal to $\pi$.

A stronger result is proven in [Fri06]. 

Theorem 6.5.9 Let $X_0, X_1, \ldots$ be a nonhomogeneous Markov chain on $\langle n \rangle$, given by the sequence of stochastic matrices $S_1, S_2, \ldots$, defined in (6.5.1). Assume that $\lim_{k \to \infty} S_k = S$, where $S$ is a stochastic matrix. Suppose furthermore that the homogeneous Markov chain corresponding to $S$ has a stationary distribution $\pi$. Then the given nonhomogeneous Markov process has a stationary distribution equal to $\pi$.

We close this section with Google's Page Ranking. Let $n$ be the current number of Web pages. (Currently around a few billion.) Then $S = [s_{ij}] \in \mathcal{S}_n$ is defined as follows. Let $A(i) \subset \langle n \rangle$ be the set of all pages accessible from the Web page $i$. Assume first that $i$ is a dangling Web page, i.e. $A(i) = \emptyset$. Then $s_{ij} = \frac{1}{n}$ for $j = 1, \ldots, n$. Assume now that $n_i = \#A(i) \ge 1$. Then $s_{ij} = \frac{1}{n_i}$ if $j \in A(i)$ and otherwise $s_{ij} = 0$. Let $0 < \omega \in \Pi_n$, $t \in (0, 1)$. Then the Google positive stochastic matrix is given by

(6.5.2) $G = tS + (1 - t)\mathbf{1}\omega^T$.

It is rumored that $t \approx 0.85$. Then the stationary distribution corresponding to $G$ is given by $G^T\pi = \pi \in \Pi_n$. The coordinates of $\pi = (\pi_1, \ldots, \pi_n)^T$ constitute Google's popularity score of each Web page. I.e. if $\pi_i > \pi_j$ then Web page $i$ is more popular than Web page $j$.

A reasonable choice of $\omega$ would be the stationary distribution of yesterday's Google stochastic matrix. To find the stationary distribution $\pi$ one can iterate several times the equality

(6.5.3) $\pi_k^T = \pi_{k-1}^TG$, $k = 1, \ldots, N$.

Then $\pi_N$ would be a good approximation of $\pi$. One can choose $\pi_0 = \omega$.
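
The definitions (6.5.2) and (6.5.3) translate directly into a few lines of code. The following is a toy sketch under stated assumptions: pages are labelled $0, \ldots, n-1$, the link structure is a dictionary `links` mapping a page to the pages accessible from it, $\omega$ defaults to the uniform distribution, and the function names `google_matrix` and `pagerank` are made up for this illustration. It builds the dense matrix $G$, which is only feasible for tiny $n$.

```python
def google_matrix(links, n, t=0.85, omega=None):
    """Build G = t*S + (1 - t) * 1 * omega^T as in (6.5.2)."""
    if omega is None:
        omega = [1.0 / n] * n                # assumed default: uniform omega
    G = []
    for i in range(n):
        out = set(links.get(i, ()))          # A(i), the pages accessible from i
        if out:
            row_S = [1.0 / len(out) if j in out else 0.0 for j in range(n)]
        else:                                # dangling page
            row_S = [1.0 / n] * n
        G.append([t * row_S[j] + (1.0 - t) * omega[j] for j in range(n)])
    return G

def pagerank(G, pi0, N=200):
    """Iterate pi_k^T = pi_{k-1}^T G for k = 1, ..., N as in (6.5.3)."""
    n = len(G)
    pi = list(pi0)
    for _ in range(N):
        pi = [sum(pi[i] * G[i][j] for i in range(n)) for j in range(n)]
    return pi

links = {0: [1, 2], 1: [2], 2: [0]}          # a three-page toy Web
G = google_matrix(links, 3)
pi = pagerank(G, [1.0 / 3] * 3)
print(pi)
```

In this toy example page 2 receives links from both other pages and ends up with the largest score, followed by page 0 and then page 1.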



Problems 

1. Show

(a) If $S_1, S_2 \in \mathcal{S}_n$ then $S_1S_2 \in \mathcal{S}_n$.

(b) $P\mathcal{S}_n = \mathcal{S}_nP = \mathcal{S}_n$ for any $P \in \mathcal{P}_n$.

(c) $\mathcal{S}_n$ is a compact set in $\mathbb{R}_+^{n \times n}$.

(d) If $S_1, S_2 \in \Omega_n$ then $S_1S_2 \in \Omega_n$.

(e) $P\Omega_n = \Omega_nP = \Omega_n$ for any $P \in \mathcal{P}_n$.

(f) $\Omega_n$ is a compact set in $\mathbb{R}_+^{n \times n}$.

(g) $\mathcal{P}_n$ is a group of doubly stochastic matrices of cardinality $n!$.

2. Prove Lemma 6.5.2.

3. Let $A, B \in \mathbb{C}^{n \times n}$, and assume that $A$ and $B$ are similar. I.e. $A = TBT^{-1}$ for some invertible $T$. Then the sequence $A^k$, $k = 1, 2, \ldots$, converges if and only if $B^k$, $k = 1, 2, \ldots$, converges.

4. Prove Lemma 6.5.6.

6.6 Friedland-Karlin results 

Definition 6.6.1 Let $B = [b_{ij}]_{i,j=1}^{n} \in \mathbb{R}^{n \times n}$. $B$ is called a Z-matrix if $b_{ij} \le 0$ for each $i \ne j$. $B$ is called an M-matrix if $B = rI - A$ where $A \in \mathbb{R}_+^{n \times n}$ and $r \ge \rho(A)$. For $r = \rho(A)$, $B$ is called a singular M-matrix.

The following result is straightforward, see Problem 1. 

Lemma 6.6.2 Let $B = [b_{ij}] \in \mathbb{R}^{n \times n}$ be a Z-matrix. Let $C = [c_{ij}] \in \mathbb{R}_+^{n \times n}$ be defined as follows. $c_{ij} = -b_{ij}$ for each $i \ne j$ and $c_{ii} = r_0 - b_{ii}$ for $i = 1, \ldots, n$, where $r_0 = \max_{i \in \langle n \rangle} b_{ii}$. Then $B = r_0I - C$. Furthermore, $B = rI - A$ for some $A \in \mathbb{R}_+^{n \times n}$ if and only if $r = r_0 + t$, $A = tI + C$ for some $t \ge 0$.

Theorem 6.6.3 Let $B \in \mathbb{R}^{n \times n}$ be a Z-matrix. Then the following are equivalent.

1. $B$ is an M-matrix.

2. All principal minors of $B$ are nonnegative.

3. The sum of all $k \times k$ principal minors of $B$ is nonnegative for $k = 1, \ldots, n$.

4. For each $t > 0$ there exists $0 < x \in \mathbb{R}_+^n$, which may depend on $t$, such that $Bx \ge -tx$.

Proof. 1 ⇒ 2. We first show that $\det B \ge 0$. Let $\lambda_1(A), \ldots, \lambda_n(A)$ be the eigenvalues of $A$. Assume that $\lambda_i(A)$ is real. Then $r - \lambda_i(A) \ge \rho(A) - \lambda_i(A) \ge 0$. Assume that $\lambda_i(A)$ is complex. Since $A$ is a real valued matrix, $\bar{\lambda}_i(A)$ is also an eigenvalue of $A$. Hence $(r - \lambda_i(A))(r - \bar{\lambda}_i(A)) = |r - \lambda_i(A)|^2 \ge 0$. Since $\det B = \prod_{i=1}^{n} (r - \lambda_i(A))$, we deduce that $\det B \ge 0$. Let $B'$ be a principal submatrix of $B$. Then $B' = rI' - A'$, where $A'$ is the corresponding principal submatrix of $A$ and $I'$ is the identity matrix of the corresponding order. Part 2 of Proposition 6.4.2 implies that $\rho(A) \ge \rho(A')$. So $B'$ is an M-matrix. Hence $\det B' \ge 0$.

2 ⇒ 3. Trivial.

3 ⇒ 1. Let $\det(tI + B) = t^n + \sum_{k=1}^{n} \beta_kt^{n-k}$. Then $\beta_k$ is the sum of all principal minors of $B$ of order $k$. Hence $\beta_k \ge 0$ for $k = 1, \ldots, n$. Therefore $0 < \det(tI + B) = \det((t + r)I - A)$ for any $t > 0$. Recall that $\det(\rho(A)I - A) = 0$. Thus $t + r > \rho(A)$ for any $t > 0$. So $r \ge \rho(A)$, i.e. $B$ is an M-matrix.

1 ⇒ 4. Let $t > 0$. Use the Neumann expansion (6.4.2) to deduce that $(tI + B)^{-1} \ge \frac{1}{t+r}I$. So for any $y > 0$, $x := (tI + B)^{-1}y > 0$. So $y = (tI + B)x > 0$, i.e. $Bx > -tx$.

4 ⇒ 1. By considering $PBP^T = rI - PAP^T$ we may assume that $A = [A_{ij}]_{i,j=1}^{t+f}$ is in the Frobenius normal form (6.4.3). Let $Bx \ge -tx$. Partition $x^T = (x_1^T, \ldots, x_{t+f}^T)$. Hence $(t + r)x_i \ge A_{ii}x_i$. Problem 11 yields that $t + r \ge \rho(A_{ii})$ for $i = 1, \ldots, t + f$. Hence $t + r \ge \rho(A)$. Since $t > 0$ was arbitrary we deduce that $r \ge \rho(A)$. □

Corollary 6.6.4 Let $B$ be a Z-matrix. Assume that there exists $x > 0$ such that $Bx \ge 0$. Then $B$ is an M-matrix.
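
Condition 2 of Theorem 6.6.3 gives a direct, if exponentially expensive, test for small matrices. The sketch below checks the Z-pattern and then all principal minors using exact rational arithmetic; the function names `det`, `is_z_matrix` and `is_m_matrix` are assumptions of this example, and the brute-force enumeration is meant only as an illustration, not as a practical algorithm.

```python
from fractions import Fraction
from itertools import combinations

def det(M):
    """Determinant by Gaussian elimination over the rationals (exact)."""
    M = [[Fraction(v) for v in row] for row in M]
    n = len(M)
    d = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= factor * M[c][k]
    return d

def is_z_matrix(B):
    """All off-diagonal entries are nonpositive."""
    n = len(B)
    return all(B[i][j] <= 0 for i in range(n) for j in range(n) if i != j)

def is_m_matrix(B):
    """Z-matrix whose principal minors are all nonnegative (Theorem 6.6.3, 1 <=> 2)."""
    n = len(B)
    if not is_z_matrix(B):
        return False
    for k in range(1, n + 1):
        for idx in combinations(range(n), k):
            if det([[B[i][j] for j in idx] for i in idx]) < 0:
                return False
    return True

print(is_m_matrix([[2, -1], [-1, 2]]))    # nonsingular M-matrix
print(is_m_matrix([[1, -1], [-1, 1]]))    # singular M-matrix
print(is_m_matrix([[1, -2], [-2, 1]]))    # Z-matrix with negative determinant
```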

Lemma 6.6.5 Let $B = [b_{ij}]_{i,j=1}^{n}$ be a real symmetric matrix with the eigenvalues $\lambda_1(B) \ge \cdots \ge \lambda_n(B)$. Assume that $\lambda_n(B)$ is a simple eigenvalue, i.e. $\lambda_{n-1}(B) > \lambda_n(B)$. Suppose that $Bx = \lambda_n(B)x$, where $x \in \mathbb{R}^n$, $x^Tx = 1$. Let $U \subset \mathbb{R}^n$ be a subspace which does not contain $x$. Then

(6.6.1) $\min_{y \in U,\, y^Ty = 1} y^TBy > \lambda_n(B)$.

Proof. Recall the minimum characterization of $\lambda_n(B)$

(6.6.2) $\min_{z \in \mathbb{R}^n,\, z^Tz = 1} z^TBz = \lambda_n(B)$.

See for example http://www.math.uic.edu/~friedlan/math3101ec.pdf page 114. Equality holds if and only if $Bz = \lambda_n(B)z$, where $z^Tz = 1$. Since $\lambda_n(B)$ is simple it follows that $z = \pm x$. Since $U$ does not contain $x$, hence it does not contain $-x$, we deduce that the minimum in (6.6.1) is achieved for some $y^* \ne \pm x$. Hence this minimum is greater than $\lambda_n(B)$. □



Corollary 6.6.6 Let $B \in \mathbb{R}^{n \times n}$ be a singular symmetric M-matrix of the form $B = \rho(C)I - C$, where $C$ is a nonnegative irreducible symmetric matrix. Let $U = \{y \in \mathbb{R}^n,\ \mathbf{1}^Ty = 0\}$. Then $\lambda_n(B) = 0$ is a simple eigenvalue, and (6.6.1) holds.

As usual, we let $\|z\| := \sqrt{z^*z}$ for any $z \in \mathbb{C}^n$ be the Euclidean norm of $z$.

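Theorem 6.6.7 Let $\mathcal{D} \subset \mathbb{R}^m$ be a bounded domain. ($\mathcal{D}$ is open and connected; $\partial\mathcal{D}$, the boundary of $\mathcal{D}$, is a compact set, so $\mathcal{D} \cup \partial\mathcal{D}$ is a compact set in $\mathbb{R}^m$.) Let $f: \mathcal{D} \to \mathbb{R}$ be in $C^2(\mathcal{D})$, i.e. the function and its derivatives up to the second order are continuous. Suppose that $f|\partial\mathcal{D} = \infty$, i.e. for each sequence $x_i \in \mathcal{D}$, $i = 1, 2, \ldots$ such that $\lim_{i \to \infty} x_i = x \in \partial\mathcal{D}$, $\lim_{i \to \infty} f(x_i) = \infty$. Assume furthermore that for each critical point $\xi \in \mathcal{D}$, i.e. $\nabla f(\xi) = (\frac{\partial f}{\partial x_1}(\xi), \ldots, \frac{\partial f}{\partial x_m}(\xi))^T = 0$, the eigenvalues of the Hessian $H(\xi) = [\frac{\partial^2 f}{\partial x_i \partial x_j}(\xi)]_{i,j=1}^{m}$ are positive. Then $f$ has a unique critical point $\xi \in \mathcal{D}$, which is a global minimum, i.e. $f(x) \ge f(\xi)$ for any $x \in \mathcal{D}$.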

Proof. Consider the negative gradient flow

(6.6.3) $\frac{dx(t)}{dt} = -\nabla f(x(t))$, $x(t_0) = x_0 \in \mathcal{D}$.

Clearly, the fixed points of this flow are the critical points of $f$. Observe next that if $x_0$ is not a critical point then $f(x(t))$ decreases, as $\frac{df(x(t))}{dt} = -\|\nabla f(x(t))\|^2$. Since $f|\partial\mathcal{D} = \infty$, we deduce that all accumulation points of the flow $x(t)$, $t \in [t_0, \infty)$ are in $\mathcal{D}$, and are critical points of $f$. Consider the flow (6.6.3) in the neighborhood of a critical point $\xi \in \mathcal{D}$. Let $x = y + \xi$, $x_0 = y_0 + \xi$. Then for $x$ close to $\xi$ the flow (6.6.3) is of the form

$$\frac{dy(t)}{dt} = -(H(\xi)y + \mathrm{Er}(y)), \qquad y(t_0) = y_0.$$

For a given $\delta > 0$ there exists $\varepsilon = \varepsilon(\delta) > 0$ such that for $\|y\| < \varepsilon$, $\|\mathrm{Er}(y)\| < \delta\|y\|$. Let $\alpha > 0$ be the smallest eigenvalue of $H(\xi)$. So $z^TH(\xi)z \ge \alpha\|z\|^2$ for any $z \in \mathbb{R}^m$. Choose $\varepsilon > 0$ so that $\|\mathrm{Er}(y)\| < \frac{\alpha}{2}\|y\|$ for $\|y\| < \varepsilon$. Thus

$$\frac{d\|y(t)\|^2}{dt} = -2\big(y(t)^TH(\xi)y(t) + y(t)^T\mathrm{Er}(y)\big) \le -\alpha\|y\|^2 \quad \text{if } \|y(t)\| < \varepsilon.$$

This shows that if $\|y(t_0)\| < \varepsilon$ then for $t \ge t_0$, $\|y(t)\|$ decreases. Moreover

$$\frac{d\log\|y(t)\|^2}{dt} \le -\alpha \text{ for } t \ge t_0 \;\Rightarrow\; \|y(t)\|^2 \le \|y_0\|^2e^{-\alpha(t-t_0)} \text{ for } t \ge t_0.$$

This shows that $\lim_{t \to \infty} y(t) = 0$. Let $\beta \ge \alpha$ be the maximal eigenvalue of $H(\xi)$. Similar estimates show that if $\|y_0\| < \varepsilon$ then $\|y(t)\|^2 \ge \|y_0\|^2e^{-(2\beta+\alpha)(t-t_0)}$.

These results, combined with the continuous dependence of the flow (6.6.3) on the initial conditions $x_0$, imply the following facts. Any flow (6.6.3) which starts at a noncritical point $x_0$ must terminate at $t = \infty$ at some critical point $\xi$, which may depend on $x_0$. For a critical point $\xi$, denote by $\mathcal{A}(\xi)$ the set of all points $x_0$ for which the flow (6.6.3) terminates at finite or infinite time at $\xi$. (The termination at finite time can happen only if $x_0 = \xi$.) Then $\mathcal{A}(\xi)$ is an open connected set of $\mathcal{D}$.

We claim that $\mathcal{A}(\xi) = \mathcal{D}$. If not, there exists a point $x_0 \in \partial\mathcal{A}(\xi) \cap \mathcal{D}$. Since $\mathcal{A}(\xi)$ is open, $x_0 \notin \mathcal{A}(\xi)$. As we showed above, $x_0 \in \mathcal{A}(\xi')$ for some other critical point $\xi' \ne \xi$. Clearly $\mathcal{A}(\xi) \cap \mathcal{A}(\xi') = \emptyset$. As $\mathcal{A}(\xi')$ is open there exists an open neighborhood of $x_0$ in $\mathcal{D}$ which belongs to $\mathcal{A}(\xi')$. Hence $x_0$ cannot be a boundary point of $\mathcal{A}(\xi)$, which contradicts our assumption. Hence $\mathcal{A}(\xi) = \mathcal{D}$, and $\xi$ is the unique critical point of $f$ in $\mathcal{D}$. Hence $\xi$ is the unique minimal point of $f$. □



Theorem 6.6.8 Let $A = [a_{ij}]_{i,j=1}^{n} \in \mathbb{R}_+^{n \times n}$ be an irreducible matrix. Suppose furthermore that $a_{ii} > 0$ for $i = 1, \ldots, n$. Let $w = (w_1, \ldots, w_n)^T > 0$. Define the following function

(6.6.4) $f(x) = f_{A,w}(x) = \sum_{i=1}^{n} w_i \log\frac{(Ax)_i}{x_i}$, $x = (x_1, \ldots, x_n)^T > 0$.

Let $\mathcal{D}$ be the interior of $\Pi_n$, the compact set of probability vectors in $\mathbb{R}^n$. ($\mathcal{D}$ can be viewed as an open connected bounded set in $\mathbb{R}^{n-1}$, see the proof.) Then $f$ satisfies the assumptions of Theorem 6.6.7. Let $0 < \xi \in \Pi_n$ be the unique critical point of $f$ in $\mathcal{D}$. Then $f(x) \ge f(\xi)$ for any $x > 0$. Equality holds if and only if $x = t\xi$ for some $t > 0$.

Proof. Observe that any probability vector p = (p 1; . . . ,p n ) T can be 
written as p = ^1 + y where y G M™,l T y = o and y > — ^1. Since 
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an Y y € K™, l T y = o is of the form y = 1, 1,0..., o) T + . . . + 

u n -^{— 1, o, . . . , o, i) T we deduce that we can view II„ as an compact con- 
nected set in M n_1 , and its interior T>, i.e. < p e i7„, is an open connected 
bounded set in R n_1 . 

We now claim that f\dH n = 00. Let < = (p lt k, ■ ■ ■ ,Pn,k) T € 
II n , k = 1, . . . , converge to p = (p 15 . . . ,_p n ) T € <9i7 n . Let 7^ Z(p) C (n) 
be the set of vanishing coordinates of p. Observe first that > > 

for i = l,...,n. Since A is irreducible, it follows that there exists I € 
Z(p),j € (n)\Z(p) such that ay > 0. Hence 

Jim VPhk> lim »^ = oo. 



Thus 



lim /(pfe) > lim log ^ Pfc ^' + V" l g a ti = 00. 

k — »r*~, £■ — vrv-, 111 j„ * 



Observe next that /(x) is a homogeneous function of degree on x > 0, 
i.e. /(fx) = /(x) for all t > 0. Hence = 0. Thus 

(6.6.5) x T V/(x) = 

for all x > 0. Let £ e V be a critical point of f\V. Then y T V/(£) = o for 
each y e R n , l T y = 0. Combine this fact with (6.6.5) for x = £ to deduce 
that I is a a critical point of / in R™ . So V/(£) = 0. Differentiate (6.6.5) 
with respect to Xi,i = 1, . . . ,n and evaluate these expressions at x = 
Since £ is a critical point we deduce that H(£)£ = 0. We claim that -ff (£) 
is a symmetric singular M-matrix. Indeed 

(6.6.6) V {x)= i + £ 



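Hence for $l \ne j$

$$\frac{\partial^2 f}{\partial x_l \partial x_j}(\mathbf{x}) = -\sum_{i=1}^n w_i \frac{a_{ij} a_{il}}{(A\mathbf{x})_i^2} \le 0.$$

So $H(\mathbf{x})$ is a Z-matrix for any $\mathbf{x} > \mathbf{0}$. Since $H(\boldsymbol{\xi})\boldsymbol{\xi} = \mathbf{0}$, Corollary 6.6.4 yields
that $H(\boldsymbol{\xi})$ is a symmetric singular M-matrix. So $H(\boldsymbol{\xi}) = \rho(C)I - C$, $C = [c_{ij}]_{i,j=1}^n$.
We claim that $C$ is an irreducible matrix. Indeed, assume that $a_{jl} > 0$. Then

$$c_{jl} = c_{lj} = -\frac{\partial^2 f}{\partial x_l \partial x_j}(\boldsymbol{\xi}) \ge w_j \frac{a_{jj} a_{jl}}{(A\boldsymbol{\xi})_j^2} > 0.$$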
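Since $A$ is irreducible, $C$ is irreducible. Hence $0 = \lambda_n(H(\boldsymbol{\xi}))$ is a simple
eigenvalue of $H(\boldsymbol{\xi})$. The restriction of the quadratic form corresponding to
the Hessian of $f|_{\Pi_n}$ at $\boldsymbol{\xi}$ corresponds to $\mathbf{y}^T H(\boldsymbol{\xi})\mathbf{y}$, where $\mathbf{1}^T\mathbf{y} = 0$.
Corollary 6.6.5 implies that there exists $\alpha > 0$ such that $\mathbf{y}^T H(\boldsymbol{\xi})\mathbf{y} \ge \alpha\|\mathbf{y}\|^2$ for
all $\mathbf{y}$ with $\mathbf{1}^T\mathbf{y} = 0$. Hence the Hessian of $f|_{\mathcal{D}}$ at the critical point $\mathbf{0} < \boldsymbol{\xi} \in \Pi_n$
has positive eigenvalues. Theorem 6.6.7 yields that there exists a unique
critical point $\boldsymbol{\xi} \in \mathcal{D}$ of $f|_{\mathcal{D}}$ such that $f(\mathbf{p}) \ge f(\boldsymbol{\xi})$ for any $\mathbf{p} \in \mathcal{D}$. Since
$f(\mathbf{x})$ is a homogeneous function of degree $0$ we deduce that $f(\mathbf{x}) \ge f(\boldsymbol{\xi})$ for
any $\mathbf{x} > \mathbf{0}$. Equality holds if and only if $\mathbf{x} = t\boldsymbol{\xi}$ for some $t > 0$. □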

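Theorem 6.6.9 Let $A \in \mathbb{R}_+^{n \times n}$ and assume that
$A\mathbf{u} = \rho(A)\mathbf{u}$, $A^T\mathbf{v} = \rho(A)\mathbf{v}$, $0 < \rho(A)$, $\mathbf{0} < \mathbf{u} = (u_1, \ldots, u_n)^T$, $\mathbf{v} = (v_1, \ldots, v_n)^T$, $\mathbf{v}^T\mathbf{u} = 1$.
Then

(6.6.7) $\sum_{i=1}^n u_i v_i \log \frac{(A\mathbf{x})_i}{x_i} \ge \log \rho(A)$ for any $\mathbf{x} = (x_1, \ldots, x_n)^T > \mathbf{0}$,

(6.6.8) $\rho(DA) \ge \rho(A) \prod_{i=1}^n d_i^{u_i v_i}$ for any diagonal $D = \operatorname{diag}(d_1, \ldots, d_n) \ge 0$.

Equality holds for $\mathbf{x} = t\mathbf{u}$ and $D = sI$, where $t > 0$, $s > 0$, respectively.
Assume that $A$ is irreducible and all the diagonal entries of $A$ are positive.
Then equality holds in (6.6.7) and (6.6.8) if and only if $\mathbf{x} = t\mathbf{u}$, $D = sI$ for
some $t > 0$, $s > 0$, respectively.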

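Proof. Assume that $A = [a_{ij}]_{i,j=1}^n \in \mathbb{R}_+^{n \times n}$ is irreducible and $a_{ii} > 0$
for $i = 1, \ldots, n$. Let $\mathbf{w} = (u_1 v_1, \ldots, u_n v_n)^T$. Define $f(\mathbf{x})$ as in (6.6.4). We
claim that $\mathbf{u}$ is a critical point of $f$. Indeed, (6.6.6) yields

$$\frac{\partial f}{\partial x_j}(\mathbf{u}) = -\frac{u_j v_j}{u_j} + \sum_{i=1}^n u_i v_i \frac{a_{ij}}{(A\mathbf{u})_i} = -v_j + \frac{1}{\rho(A)}(A^T\mathbf{v})_j = 0, \quad j = 1, \ldots, n.$$

Similarly, $t\mathbf{u}$ is a critical point of $f$ for any $t > 0$. In particular, $\boldsymbol{\xi} = t\mathbf{u} \in \Pi_n$
is a critical point of $f$ in $\mathcal{D}$. Theorem 6.6.8 implies that $f(\mathbf{x}) \ge f(\mathbf{u}) = \log \rho(A)$
and equality holds if and only if $\mathbf{x} = t\mathbf{u}$ for some $t > 0$.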

Let $D$ be a diagonal matrix with positive diagonal entries. Then $DA$
is irreducible, and $DA\mathbf{x} = \rho(DA)\mathbf{x}$ for some $\mathbf{x} = (x_1, \ldots, x_n)^T > \mathbf{0}$. Note
that since $f(\mathbf{u}) \le f(\mathbf{x})$ we deduce

$$\log \rho(A) \le \sum_{i=1}^n u_i v_i \log \frac{(A\mathbf{x})_i}{x_i} = \log \rho(DA) - \sum_{i=1}^n u_i v_i \log d_i.$$

The above inequality yields (6.6.8). Suppose that equality holds in (6.6.8).
Then $\mathbf{x} = t\mathbf{u}$, which yields that $D = sI$ for some $s > 0$. Suppose that
$D \ge 0$ and $D$ has at least one zero diagonal element. Then the right-hand
side of (6.6.8) is zero. Clearly $\rho(DA) \ge 0$. Since $d_i a_{ii}$ is a principal
submatrix of $DA$, Lemma 6.4.2 yields that $\rho(DA) \ge \max_{i \in \langle n \rangle} d_i a_{ii}$. Hence
$\rho(DA) = 0$ if and only if $D = 0$. These arguments prove the theorem when
$A$ is irreducible with positive diagonal entries.

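Let us now consider the general case. For $\varepsilon > 0$ let $A(\varepsilon) := A + \varepsilon\mathbf{u}\mathbf{v}^T$.
Then $A(\varepsilon) > 0$ and $A(\varepsilon)\mathbf{u} = (\rho(A) + \varepsilon)\mathbf{u}$, $A(\varepsilon)^T\mathbf{v} = (\rho(A) + \varepsilon)\mathbf{v}$. Hence
inequalities (6.6.7) and (6.6.8) hold for $A(\varepsilon)$ and fixed $\mathbf{x} > \mathbf{0}$, $D \ge 0$. Let
$\varepsilon \searrow 0$ to deduce (6.6.7) and (6.6.8). For $\mathbf{x} = t\mathbf{u}$, $D = sI$, where $t > 0$, $s > 0$,
one has equality. □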



Corollary 6.6.10 Let the assumptions of Theorem 6.6.9 hold. Then

(6.6.9) $\sum_{i=1}^n u_i v_i \frac{(A\mathbf{x})_i}{x_i} \ge \rho(A)$ for any $\mathbf{x} = (x_1, \ldots, x_n)^T > \mathbf{0}$.

If $A$ is irreducible and has positive diagonal entries then equality holds if
and only if $\mathbf{x} = t\mathbf{u}$ for some $t > 0$.

Proof. Use the arithmetic-geometric inequality $\sum_{i=1}^n p_i c_i \ge \prod_{i=1}^n c_i^{p_i}$
for any $\mathbf{p} = (p_1, \ldots, p_n)^T \in \Pi_n$ and any $\mathbf{c} = (c_1, \ldots, c_n)^T > \mathbf{0}$. □
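
The inequalities (6.6.7) and (6.6.9) can be checked numerically. The following is a minimal sketch under stated assumptions, not part of the original text: the matrix `A` is a hypothetical positive example, the helper names (`matvec`, `perron`, `lhs_log`, `lhs_lin`) are illustrative, and the Perron vectors are approximated by plain power iteration, which is assumed to converge because `A` is entrywise positive.

```python
import math
import random

def matvec(A, x):
    """Return the product A x for a list-of-rows matrix A."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def perron(A, iters=2000):
    """Power iteration for an entrywise positive matrix A.
    Returns (rho, u) with A u ~ rho u, u > 0 and sum(u) = 1."""
    u = [1.0 / len(A)] * len(A)
    rho = 0.0
    for _ in range(iters):
        y = matvec(A, u)
        rho = sum(y)               # sum(u) = 1, so sum(A u) estimates rho
        u = [yi / rho for yi in y]
    return rho, u

# Hypothetical positive test matrix (irreducible, positive diagonal).
A = [[2.0, 1.0, 0.5],
     [0.3, 1.0, 2.0],
     [1.0, 0.7, 3.0]]
rho, u = perron(A)
_, v = perron([list(col) for col in zip(*A)])   # left Perron vector
s = sum(ui * vi for ui, vi in zip(u, v))
w = [ui * vi / s for ui, vi in zip(u, v)]       # u o v rescaled so that v^T u = 1

def lhs_log(x):
    """Left-hand side of (6.6.7): sum_i u_i v_i log((Ax)_i / x_i)."""
    return sum(wi * math.log(yi / xi) for wi, yi, xi in zip(w, matvec(A, x), x))

def lhs_lin(x):
    """Left-hand side of (6.6.9): sum_i u_i v_i (Ax)_i / x_i."""
    return sum(wi * yi / xi for wi, yi, xi in zip(w, matvec(A, x), x))

# Sample random positive vectors and record the smallest values seen.
random.seed(0)
samples = [[random.uniform(0.1, 10.0) for _ in range(3)] for _ in range(1000)]
min_log = min(lhs_log(x) for x in samples)
min_lin = min(lhs_lin(x) for x in samples)
```

On the sampled vectors both minima stay above $\log \rho(A)$ and $\rho(A)$ respectively, while `lhs_log(u)` and `lhs_lin(u)` reproduce the equality case $\mathbf{x} = t\mathbf{u}$ up to rounding.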



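Definition 6.6.11 Let $\mathbf{x} = (x_1, \ldots, x_n)^T$, $\mathbf{y} = (y_1, \ldots, y_n)^T \in \mathbb{C}^n$.
Denote by $D(\mathbf{x}) = \operatorname{diag}(\mathbf{x})$ the diagonal matrix $\operatorname{diag}(x_1, \ldots, x_n)$, by $e^{\mathbf{x}} :=
(e^{x_1}, \ldots, e^{x_n})^T$, by $\mathbf{x}^{-1} := (\frac{1}{x_1}, \ldots, \frac{1}{x_n})^T$ for $\mathbf{x} > \mathbf{0}$, and by $\mathbf{x} \circ \mathbf{y}$ the vector
$(x_1 y_1, \ldots, x_n y_n)^T$.

For a square diagonal matrix $D = \operatorname{diag}(d_1, \ldots, d_n) \in \mathbb{C}^{n \times n}$ denote by
$\mathbf{x}(D)$ the vector $(d_1, \ldots, d_n)^T$.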

Theorem 6.6.12 Let $\mathbf{0} < \mathbf{u} = (u_1, \ldots, u_n)^T$, $\mathbf{v} = (v_1, \ldots, v_n)^T$. Let
$\mathbf{0} < \mathbf{w} = \mathbf{u} \circ \mathbf{v}$. Assume that $A = [a_{ij}]_{i,j=1}^n \in \mathbb{R}_+^{n \times n}$ is irreducible. Then
there exist two diagonal matrices $D_1, D_2 \in \mathbb{R}_+^{n \times n}$, with positive diagonal
entries, such that $D_1 A D_2 \mathbf{u} = \mathbf{u}$, $(D_1 A D_2)^T \mathbf{v} = \mathbf{v}$ if one of the following
conditions holds. Under any of these conditions $D_1, D_2$ are unique up to the
transformation $t^{-1}D_1, tD_2$ for some $t > 0$.

1. All diagonal entries of $A$ are positive.

2. Let $\mathcal{N} \subset \langle n \rangle$ be the nonempty set of all $j \in \langle n \rangle$ such that $a_{jj} = 0$.
Assume that all off-diagonal entries of $A$ are positive and the following
inequalities hold.

(6.6.10) $\sum_{i \in \langle n \rangle \setminus \{j\}} w_i > w_j$ for all $j \in \mathcal{N}$.

(For $n = 2$ and $\mathcal{N} = \langle 2 \rangle$ the above inequalities are not satisfied by
any $\mathbf{w}$.)

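Proof. We first observe that it is enough to consider the case where
$\mathbf{u} = \mathbf{1}$, i.e. $B := D_1 A D_2$ is a stochastic matrix. See Problem 2. In this
case $\mathbf{w} = \mathbf{v}$.

1. Assume that all diagonal entries of $A$ are positive. Let $f = f_{\mathbf{w},A}$ be
defined as in (6.6.4). The proof of Theorem 6.6.8 yields that $f$ has a unique
critical point $\mathbf{0} < \boldsymbol{\xi}$ in $\Pi_n$. (6.6.6) implies that

$$\sum_{i=1}^n w_i \frac{a_{ij}}{(A\boldsymbol{\xi})_i} = \frac{w_j}{\xi_j}, \quad j = 1, \ldots, n.$$

This is equivalent to the equality $(D(A\boldsymbol{\xi})^{-1} A D(\boldsymbol{\xi}))^T \mathbf{w} = \mathbf{w}$. A straightforward
calculation shows that $D(A\boldsymbol{\xi})^{-1} A D(\boldsymbol{\xi})\mathbf{1} = \mathbf{1}$. Hence $D_1 = D(A\boldsymbol{\xi})^{-1}$, $D_2 = D(\boldsymbol{\xi})$.

Suppose that $D_1, D_2$ are diagonal matrices with positive diagonal entries
so that $D_1 A D_2 \mathbf{1} = \mathbf{1}$ and $(D_1 A D_2)^T \mathbf{w} = \mathbf{w}$. Let $\mathbf{u} = D_2 \mathbf{1}$. The equality
$D_1 A D_2 \mathbf{1} = \mathbf{1}$ implies that $D_1 = D(A\mathbf{u})^{-1}$. The equality $(D_1 A D_2)^T \mathbf{w} = \mathbf{w}$
is equivalent to $(D(A\mathbf{u})^{-1} A D(\mathbf{u}))^T \mathbf{w} = \mathbf{w}$. Hence $\mathbf{u}$ is a critical point of
$f$. Therefore $\mathbf{u} = t\boldsymbol{\xi}$. So $D_2 = tD(\boldsymbol{\xi})$ and $D_1 = t^{-1} D(A\boldsymbol{\xi})^{-1}$.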

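2. As in the proof of Theorem 6.6.8, we show that $f = f_{\mathbf{w},A}$ blows
up to $\infty$ as $\mathbf{p}$ approaches $\partial \Pi_n$. Let $\mathbf{0} < \mathbf{p}_k = (p_{1,k}, \ldots, p_{n,k})^T \in \Pi_n$, $k =
1, 2, \ldots$, converge to $\mathbf{p} = (p_1, \ldots, p_n)^T \in \partial \Pi_n$. Let $\emptyset \ne Z(\mathbf{p}) \subset \langle n \rangle$ be
the set of vanishing coordinates of $\mathbf{p}$. Since all off-diagonal entries of $A$
are positive, it follows that $\lim_{k \to \infty} \frac{(A\mathbf{p}_k)_i}{p_{i,k}} = \infty$ for each $i \in Z(\mathbf{p})$. To
show that $\lim_{k \to \infty} f(\mathbf{p}_k) = \infty$ it is enough to consider the case where
$\lim_{k \to \infty} \frac{(A\mathbf{p}_k)_m}{p_{m,k}} = 0$ for some $m \notin Z(\mathbf{p})$. In view of the proof of Theorem 6.6.8
we deduce that $m \in \mathcal{N}$. Furthermore, $\# Z(\mathbf{p}) = n - 1$. Hence
$\lim_{k \to \infty} p_{m,k} = 1$. Assume for simplicity of notation that $m = 1$, i.e.
$Z(\mathbf{p}) = \{2, 3, \ldots, n\}$. Let $s_k = \max_{i \ge 2} p_{i,k}$. So $\lim_{k \to \infty} s_k = 0$. Let $a > 0$
be the value of the minimal off-diagonal entry of $A$. Then
$\frac{(A\mathbf{p}_k)_i}{p_{i,k}} \ge \frac{a\,p_{1,k}}{p_{i,k}} \ge \frac{a\,p_{1,k}}{s_k}$ for $i \ge 2$. Also $\frac{(A\mathbf{p}_k)_1}{p_{1,k}} \ge \frac{a\,s_k}{p_{1,k}}$. Thus

$$f(\mathbf{p}_k) \ge w_1 \log \frac{a\,s_k}{p_{1,k}} + \sum_{i \ge 2} w_i \log \frac{a\,p_{1,k}}{s_k} = \Big(\sum_{i=1}^n w_i\Big) \log a + \Big(-w_1 + \sum_{i \ge 2} w_i\Big) \log \frac{p_{1,k}}{s_k}.$$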






(6.6.10) for $j = 1$ implies that $\lim_{k \to \infty} f(\mathbf{p}_k) = \infty$.

Let $\mathbf{0} < \boldsymbol{\xi} \in \Pi_n$ be a critical point of $f$. We claim that $H(\boldsymbol{\xi}) =
\rho(C)I - C$, and $0 \le C$ is irreducible. Indeed, for $j \ne l$, $c_{jl} = \sum_{i=1}^n w_i \frac{a_{ij} a_{il}}{(A\boldsymbol{\xi})_i^2}$.
Since $n \ge 3$ choose $i \ne j, l$ to deduce that $c_{jl} > 0$. So $C$ has positive off-diagonal
entries, hence it is irreducible. Hence $\mathbf{0} < \boldsymbol{\xi} \in \Pi_n$ is the unique critical
point in $\Pi_n$. The arguments for 1 yield the existence of $D_1, D_2$, which are
unique up to scaling. □
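
For a concrete feel of Theorem 6.6.12 in the normalized case $\mathbf{u} = \mathbf{1}$, $\mathbf{v} = \mathbf{w}$, the scaling can be approximated numerically. The sketch below does not follow the proof (which minimizes $f_{\mathbf{w},A}$); it swaps in a Sinkhorn-type alternating scaling, assuming an entrywise positive matrix so that the alternation converges. The function name `scale` and the test data are hypothetical.

```python
def scale(A, w, iters=500):
    """Alternately enforce B 1 = 1 and B^T w = w for B = D1 A D2.
    Returns the diagonals (d1, d2). Assumes A > 0 entrywise and w > 0."""
    n = len(A)
    d2 = [1.0] * n
    for _ in range(iters):
        # Row step: choose d1 so that every row of D1 A D2 sums to 1.
        d1 = [1.0 / sum(A[i][j] * d2[j] for j in range(n)) for i in range(n)]
        # Column step: choose d2 so that the w-weighted column sums equal w.
        col = [sum(w[i] * d1[i] * A[i][j] for i in range(n)) for j in range(n)]
        d2 = [w[j] / col[j] for j in range(n)]
    # Final row step so that B 1 = 1 holds up to rounding.
    d1 = [1.0 / sum(A[i][j] * d2[j] for j in range(n)) for i in range(n)]
    return d1, d2

# Hypothetical positive matrix and probability vector w.
A = [[2.0, 1.0, 1.0],
     [1.0, 3.0, 1.0],
     [1.0, 1.0, 4.0]]
w = [0.2, 0.3, 0.5]

d1, d2 = scale(A, w)
B = [[d1[i] * A[i][j] * d2[j] for j in range(3)] for i in range(3)]
row_sums = [sum(row) for row in B]                                    # B 1
Btw = [sum(w[i] * B[i][j] for i in range(3)) for j in range(3)]        # B^T w
```

Here `row_sums` approximates $\mathbf{1}$ and `Btw` approximates $\mathbf{w}$. Replacing `(d1, d2)` by `(d1 / t, t * d2)` leaves `B` unchanged, which is the scaling freedom stated in the theorem.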



Problems 

1. Prove Lemma 6.6.2. 

2. Let $B \in \mathbb{R}^{n \times n}$ and assume that $B\mathbf{u} = \mathbf{u}$, $B^T\mathbf{v} = \mathbf{v}$, where $\mathbf{0} < \mathbf{u} =
(u_1, \ldots, u_n)^T$, $\mathbf{v} = (v_1, \ldots, v_n)^T$. Then $C := D(\mathbf{u})^{-1} B D(\mathbf{u})$ satisfies
the following: $C\mathbf{1} = \mathbf{1}$, $C^T\mathbf{w} = \mathbf{w}$, where $\mathbf{w} = \mathbf{u} \circ \mathbf{v}$.

3. A matrix $A \in \mathbb{R}_+^{n \times n}$ is called fully indecomposable if there exists $P \in \mathcal{P}_n$
such that $PA$ is irreducible and has positive diagonal elements. Show
that if $A$ is fully indecomposable, then there exist diagonal
matrices $D_1, D_2$, with positive diagonal entries, such that $D_1 A D_2$ is
doubly stochastic. $D_1, D_2$ are unique up to a scalar factor $t^{-1}D_1, tD_2$.



6.7 Convexity and log-convexity 

Definition 6.7.1 

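1. For any two points $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$ denote by $(\mathbf{x}, \mathbf{y})$ and $[\mathbf{x}, \mathbf{y}]$ the open and
the closed interval spanned by $\mathbf{x}, \mathbf{y}$ respectively. I.e. the set of points
of the form $t\mathbf{x} + (1 - t)\mathbf{y}$, where $t \in (0, 1)$ and $t \in [0, 1]$, respectively.

2. A set $D \subset \mathbb{R}^n$ is called convex if for any $\mathbf{x}, \mathbf{y} \in D$ the open interval
$(\mathbf{x}, \mathbf{y})$ is in $D$. (Note that a convex set is connected.)

3. For $\mathbf{x} \le \mathbf{y} \in \mathbb{R}^n$ we denote $[\mathbf{x}, \mathbf{y}] := \{\mathbf{z} \in \mathbb{R}^n, \ \mathbf{x} \le \mathbf{z} \le \mathbf{y}\}$. Clearly,
$[\mathbf{x}, \mathbf{y}]$ is a convex set.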

4. Let $D \subset \mathbb{R}^n$ be a convex set. Assume that $f : D \to \mathbb{R}$. $f$ is called
convex if

(6.7.1) $f(t\mathbf{x} + (1 - t)\mathbf{y}) \le t f(\mathbf{x}) + (1 - t) f(\mathbf{y})$ for all $t \in (0, 1)$, $\mathbf{x}, \mathbf{y} \in D$.
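Furthermore, $f : D \to \mathbb{R}_+$ is called log-convex if

(6.7.2) $f(t\mathbf{x} + (1 - t)\mathbf{y}) \le f(\mathbf{x})^t f(\mathbf{y})^{1 - t}$ for all $t \in (0, 1)$, $\mathbf{x}, \mathbf{y} \in D$.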



$f$ is called strictly convex or strictly log-convex if strict inequality holds
in (6.7.1) and (6.7.2), respectively.

Note that if $f$ is a positive function on $D$ then $f$ is log-convex if and only if
$\log f$ is convex on $D$. See Problem 1. The following results are well known,
e.g. [Roc70].

Fact 6.7.2 

1. $H = \mathbf{x} + U := \{\mathbf{y} \in \mathbb{R}^n, \ \mathbf{y} = \mathbf{x} + \mathbf{u}, \ \mathbf{u} \in U\}$ is a convex set if $U$
is a convex set. If $U$ is a subspace, then $H$ is called a hyperplane of
dimension $k$, where $k$ is the dimension of $U$. A $0$-dimensional hyperplane
is a point.

2. For a given convex set $D \subset \mathbb{R}^n$ let $U = \operatorname{span}\{\mathbf{y} - \mathbf{x}; \text{ for all } \mathbf{y}, \mathbf{x} \in D\}$.
Then for any $\mathbf{x} \in D$, the hyperplane $\mathbf{x} + U$ is the minimal hyperplane
containing $D$. The dimension of $D$ is defined as the dimension of
$U$, denoted as $\dim D = \dim U$. The interior of $D$, denoted by $\mathrm{ri}\, D$, is
the set of the interior points of $D - \mathbf{x}$, for a fixed $\mathbf{x} \in D$, viewed as
a set in the $\dim D$ dimensional subspace $U$. Then the closure of $D$ is
equal to the closure of $\mathrm{ri}\, D$, denoted as $\mathrm{clo}\, D = \mathrm{clo}\, \mathrm{ri}\, D$.

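3. Let $D \subset \mathbb{R}^n$ be a convex set, and $f : D \to \mathbb{R}$ a convex function. Then
$f : \mathrm{ri}\, D \to \mathbb{R}$ is a continuous function. Furthermore, at each $\mathbf{x} \in \mathrm{ri}\, D$,
$f$ has a supporting hyperplane. That is, there exists $\mathbf{p} = \mathbf{p}(\mathbf{x}) \in \mathbb{R}^n$
such that

(6.7.3) $f(\mathbf{y}) \ge f(\mathbf{x}) + \mathbf{p}^T(\mathbf{y} - \mathbf{x})$ for any $\mathbf{y} \in D$.

Assume furthermore that $\dim D = n$. Then the following conditions
are equivalent.

(a) $f$ is differentiable at $\mathbf{x} \in \mathrm{ri}\, D$. I.e. the gradient of $f$, $\nabla f$, at $\mathbf{x}$
exists and the following equality holds.

(6.7.4) $|f(\mathbf{y}) - (f(\mathbf{x}) + \nabla f(\mathbf{x})^T(\mathbf{y} - \mathbf{x}))| = o(\|\mathbf{y} - \mathbf{x}\|)$.

(b) $f$ has a unique supporting hyperplane at $\mathbf{x}$.

The set of points $\mathrm{Diff}(f) \subset \mathrm{ri}\, D$, where $f$ is differentiable, is a dense
set in $D$ of the full Lebesgue measure, i.e. $D \setminus \mathrm{Diff}(f)$ has zero Lebesgue
measure. Furthermore, $\nabla f$ is continuous on $\mathrm{Diff}(f)$.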
4. Let $D \subset \mathbb{R}^n$ be a convex set, and $f : D \to \mathbb{R}$ a function. Assume
that $f \in C^2(\mathrm{ri}\, D)$, i.e. $f$ and its first and second partial derivatives
are continuous in $\mathrm{ri}\, D$. Then $f : \mathrm{ri}\, D \to \mathbb{R}$ is convex if and only if the
Hessian matrix $H(\mathbf{x}) = [\frac{\partial^2 f}{\partial x_i \partial x_j}(\mathbf{x})]_{i,j=1}^n$ has nonnegative eigenvalues
for each $\mathbf{x} \in \mathrm{ri}\, D$. If the Hessian $H(\mathbf{x})$ has positive eigenvalues for
each $\mathbf{x} \in \mathrm{ri}\, D$, then $f$ is strictly convex on $\mathrm{ri}\, D$.

5. Let $D \subset \mathbb{R}^n$ be a convex set.

(a) If $f, g$ are convex on $D$ then $\max(f, g)$, where $\max(f, g)(\mathbf{x}) :=
\max(f(\mathbf{x}), g(\mathbf{x}))$, is convex. Furthermore, $af + bg$ is convex for
any $a, b \ge 0$.

(b) Let $f_i : D \to \mathbb{R}$ for $i = 1, 2, \ldots$. Denote by $f := \limsup_i f_i$ the
function given by $f(\mathbf{x}) := \limsup_i f_i(\mathbf{x})$ for each $\mathbf{x} \in D$. Assume
that $f : D \to \mathbb{R}$, i.e. the sequence $f_i(\mathbf{x})$, $i = 1, 2, \ldots$, is bounded for
each $\mathbf{x}$. If each $f_i$ is convex on $D$ then $f$ is convex on $D$.

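Theorem 6.7.3 Let $D \subset \mathbb{R}^m$ be a convex set. Assume that $a_{ij} : D \to \mathbb{R}_+$
are log-convex functions for $i, j = 1, \ldots, n$. Let $A(\mathbf{x}) := [a_{ij}(\mathbf{x})]_{i,j=1}^n$ be the
induced nonnegative matrix function on $D$. Then $\rho(A(\mathbf{x}))$ is a log-convex
function on $D$. Assume furthermore that each $a_{ij}(\mathbf{x}) \in C^k(\mathrm{ri}\, D)$, for some
$k \ge 1$. (All partial derivatives of $a_{ij}(\mathbf{x})$ of order less than or equal to $k$ are
continuous in $\mathrm{ri}\, D$.) Suppose furthermore that $A(\mathbf{x}_0)$ is irreducible for some
$\mathbf{x}_0 \in \mathrm{ri}\, D$. Then $0 < \rho(A(\mathbf{x})) \in C^k(\mathrm{ri}\, D)$.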

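Proof. In view of Fact 6.7.2.5a each entry of the matrix $A(\mathbf{x})^m$ is log-convex.
Hence $\operatorname{tr} A(\mathbf{x})^m$ is log-convex, which implies that $(\operatorname{tr} A(\mathbf{x})^m)^{\frac{1}{m}}$ is
log-convex. Theorem 6.4.9 combined with Fact 6.7.2.5b yields that $\rho(A(\mathbf{x}))$
is log-convex.

Assume that $A(\mathbf{x}_0)$ is irreducible for some $\mathbf{x}_0 \in \mathrm{ri}\, D$. Since each $a_{ij}(\mathbf{x})$
is continuous in $\mathrm{ri}\, D$, Problem 1 yields that the digraph $\mathcal{D}(A(\mathbf{x}))$ is a constant
digraph on $\mathrm{ri}\, D$. Since $\mathcal{D}(A(\mathbf{x}_0))$ is strongly connected, it follows that
$\mathcal{D}(A(\mathbf{x}))$ is strongly connected for each $\mathbf{x} \in \mathrm{ri}\, D$. Hence $A(\mathbf{x})$ is irreducible
for $\mathbf{x} \in \mathrm{ri}\, D$ and $\rho(A(\mathbf{x})) > 0$ is a simple root of its characteristic polynomial
for $\mathbf{x} \in \mathrm{ri}\, D$. The implicit function theorem implies that $\rho(A(\mathbf{x})) \in C^k(\mathrm{ri}\, D)$.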

□ 

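Theorem 6.7.4 Let $A \in \mathbb{R}_+^{n \times n}$. Define $A(\mathbf{x}) = D(e^{\mathbf{x}})A$ for any $\mathbf{x} \in
\mathbb{R}^n$. Then $\rho(A(\mathbf{x}))$ is a log-convex function. Suppose furthermore that $A$
is irreducible. Then $\log \rho(A(\mathbf{x}))$ is a smooth convex function on $\mathbb{R}^n$, i.e.
$\log \rho(A(\mathbf{x})) \in C^\infty(\mathbb{R}^n)$. Let

$A(\mathbf{x})\mathbf{u}(\mathbf{x}) = \rho(A(\mathbf{x}))\mathbf{u}(\mathbf{x})$, $A(\mathbf{x})^T\mathbf{v}(\mathbf{x}) = \rho(A(\mathbf{x}))\mathbf{v}(\mathbf{x})$,
$\mathbf{0} < \mathbf{u}(\mathbf{x}), \mathbf{v}(\mathbf{x})$, with the normalization $\mathbf{w}(\mathbf{x}) := \mathbf{u}(\mathbf{x}) \circ \mathbf{v}(\mathbf{x}) \in \Pi_n$.

Then

(6.7.5) $\nabla \log \rho(A(\mathbf{x})) = \frac{1}{\rho(A(\mathbf{x}))} \nabla \rho(A(\mathbf{x})) = \mathbf{w}(\mathbf{x})$.

That is, the inequality (6.6.8) corresponds to the standard inequality

(6.7.6) $\log \rho(A(\mathbf{y})) \ge \log \rho(A(\mathbf{0})) + \nabla \log \rho(A(\mathbf{0}))^T \mathbf{y}$,

for smooth convex functions.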

Proof. Clearly, the function $f_i(\mathbf{x}) = e^{x_i}$ is a smooth log-convex function
for $\mathbf{x} = (x_1, \ldots, x_n)^T \in \mathbb{R}^n$. Since $A \ge 0$ it follows that each entry of $A(\mathbf{x})$
is a log-convex function. Theorem 6.7.3 yields that $\rho(A(\mathbf{x}))$ is log-convex.

Assume in addition that $A = A(\mathbf{0})$ is irreducible. Theorem 6.7.3 yields
that $\log \rho(A(\mathbf{x}))$ is a smooth convex function on $\mathbb{R}^n$. Hence $\log \rho(A(\mathbf{x}))$ has
a unique supporting hyperplane at each $\mathbf{x}$. For $\mathbf{x} = \mathbf{0}$ this supporting
hyperplane is given by the right-hand side of (6.7.6). Consider the inequality
(6.6.8). By letting $D = D(e^{\mathbf{y}})$ and taking the logarithm of this inequality
we obtain that $\log \rho(A) + \mathbf{w}(\mathbf{0})^T\mathbf{y}$ is also a supporting hyperplane for
$\log \rho(A(\mathbf{x}))$ at $\mathbf{x} = \mathbf{0}$. Hence $\nabla \log \rho(A(\mathbf{0})) = \mathbf{w}(\mathbf{0})$. Similar arguments for
any $\mathbf{x}$ prove the equality (6.7.5). □
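
The gradient formula (6.7.5) lends itself to a quick numerical sanity check at $\mathbf{x} = \mathbf{0}$. The sketch below is an illustration only: the matrix is a hypothetical positive example, the names `perron`, `log_rho`, `grad` are made up for this purpose, the spectral radius is approximated by power iteration, and the gradient by central finite differences with an assumed step `h`.

```python
import math

def matvec(A, x):
    """Return the product A x for a list-of-rows matrix A."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def perron(A, iters=2000):
    """Power iteration for a positive matrix: returns (rho, u), sum(u) = 1."""
    u = [1.0 / len(A)] * len(A)
    rho = 0.0
    for _ in range(iters):
        y = matvec(A, u)
        rho = sum(y)
        u = [yi / rho for yi in y]
    return rho, u

def log_rho(A, x):
    """log rho(D(e^x) A): row i of A is multiplied by exp(x_i)."""
    scaled = [[math.exp(xi) * a for a in row] for xi, row in zip(x, A)]
    return math.log(perron(scaled)[0])

# Hypothetical positive test matrix.
A = [[2.0, 1.0, 0.5],
     [0.3, 1.0, 2.0],
     [1.0, 0.7, 3.0]]
n = len(A)

# w(0) = u o v, normalized to a probability vector.
_, u = perron(A)
_, v = perron([list(col) for col in zip(*A)])
s = sum(ui * vi for ui, vi in zip(u, v))
w = [ui * vi / s for ui, vi in zip(u, v)]

# Central finite differences for the gradient of log rho(A(x)) at x = 0.
h = 1e-5
grad = []
for i in range(n):
    xp = [h if j == i else 0.0 for j in range(n)]
    xm = [-h if j == i else 0.0 for j in range(n)]
    grad.append((log_rho(A, xp) - log_rho(A, xm)) / (2 * h))
```

The list `grad` agrees with `w` to within the finite-difference error, and its entries sum to $1$, reflecting $\rho(D(e^{\mathbf{x} + t\mathbf{1}})A) = e^t \rho(D(e^{\mathbf{x}})A)$.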

Problems 

1. Let $D \subset \mathbb{R}^m$ be a convex set.

(a) Show that if $f$ is a continuous log-convex function on $D$, then either $f$
is the identically zero function or $f$ is positive at each $\mathbf{x} \in D$.

(b) Assume that $f$ is positive on $D$, i.e. $f(\mathbf{x}) > 0$ for each $\mathbf{x} \in D$.
Then $f$ is log-convex on $D$ if and only if $\log f$ is convex on $D$.

(c) Assume that $f$ is log-convex on $D$. Then $f$ is continuous on $\mathrm{ri}\, D$.

2. Let $f : D \to \mathbb{R}$ be a log-convex function. Show that $f$ is a convex
function.

3. Let $D \subset \mathbb{R}^n$ be a convex set.

(a) If $f, g$ are log-convex on $D$ then $\max(f, g)$ is log-convex.
Furthermore, $f^a g^b$ and $af + bg$ are log-convex for any $a, b \ge 0$.

(b) Let $f_i : D \to \mathbb{R}$, $i = 1, 2, \ldots$, be log-convex. Assume that $f :=
\limsup_i f_i : D \to \mathbb{R}$. Then $f$ is log-convex on $D$.
6.8 Min-max characterizations of $\rho(A)$

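Theorem 6.8.1 Let $\Psi : \mathbb{R} \to \mathbb{R}$ be a differentiable convex nondecreasing
function. Let $A \in \mathbb{R}_+^{n \times n}$, and assume that $\rho(A) > 0$. Then

(6.8.1) $\sup_{\mathbf{p} = (p_1, \ldots, p_n)^T \in \Pi_n} \ \inf_{\mathbf{x} = (x_1, \ldots, x_n)^T > \mathbf{0}} \ \sum_{i=1}^n p_i \Psi\Big(\log \frac{(A\mathbf{x})_i}{x_i}\Big) = \Psi(\log \rho(A)).$

Suppose that $\Psi'(\log \rho(A)) > 0$ and $A$ has a positive eigenvector $\mathbf{u}$ which
corresponds to $\rho(A)$. If

(6.8.2) $\inf_{\mathbf{x} = (x_1, \ldots, x_n)^T > \mathbf{0}} \ \sum_{i=1}^n p_i \Psi\Big(\log \frac{(A\mathbf{x})_i}{x_i}\Big) = \Psi(\log \rho(A))$

then the vector $\mathbf{v} = \mathbf{p} \circ \mathbf{u}^{-1}$ is a nonnegative eigenvector of $A^T$ corresponding
to $\rho(A)$. In particular, if $A$ is irreducible, then $\mathbf{p}$ satisfying (6.8.2) is
unique.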

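Proof. Let $\mu(A)$ be the left-hand side of (6.8.1). We first show that
$\mu(A) \le \Psi(\log \rho(A))$. Suppose first that there exists $\mathbf{u} > \mathbf{0}$ such that $A\mathbf{u} =
\rho(A)\mathbf{u}$. Then

$$\inf_{\mathbf{x} > \mathbf{0}} \sum_{i=1}^n p_i \Psi\Big(\log \frac{(A\mathbf{x})_i}{x_i}\Big) \le \sum_{i=1}^n p_i \Psi\Big(\log \frac{(A\mathbf{u})_i}{u_i}\Big) = \Psi(\log \rho(A)),$$

for any $\mathbf{p} \in \Pi_n$. Hence $\mu(A) \le \Psi(\log \rho(A))$.

Let $J_n \in \mathbb{R}_+^{n \times n}$ be the matrix all of whose entries are equal to $1$. For $\varepsilon > 0$
let $A(\varepsilon) := A + \varepsilon J_n$. As $A(\varepsilon)$ is positive, it has a positive Perron-Frobenius
eigenvector. Hence $\mu(A(\varepsilon)) \le \Psi(\log \rho(A(\varepsilon)))$. Since $\Psi$ is nondecreasing and
$A(\varepsilon) \ge A$, it follows that $\mu(A) \le \mu(A(\varepsilon)) \le \Psi(\log \rho(A(\varepsilon)))$. Let $\varepsilon \searrow 0$,
and use the continuity of $\Psi(t)$ to deduce $\mu(A) \le \Psi(\log \rho(A))$.

Assume now that $A \in \mathbb{R}_+^{n \times n}$ is irreducible. Let $\mathbf{u}, \mathbf{v} > \mathbf{0}$ be the
right and the left Perron-Frobenius eigenvectors of $A$, such that $\mathbf{p}^* =
(p_1^*, \ldots, p_n^*)^T := \mathbf{u} \circ \mathbf{v} \in \Pi_n$. Suppose first that $\Psi(t) = t$. Theorem 6.6.9
yields the equality

$$\min_{\mathbf{x} = (x_1, \ldots, x_n)^T > \mathbf{0}} \ \sum_{i=1}^n p_i^* \log \frac{(A\mathbf{x})_i}{x_i} = \log \rho(A).$$

Hence we deduce that $\mu(A) \ge \log \rho(A)$. Combine that with the previous
inequality $\mu(A) \le \log \rho(A)$ to deduce (6.8.1) in this case.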
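Suppose next that $\Psi$ is a convex differentiable nondecreasing function
on $\mathbb{R}$. Let $s := \Psi'(\log \rho(A))$. So $s \ge 0$, and

$$\Psi(t) \ge \Psi(\log \rho(A)) + (t - \log \rho(A))s, \quad \text{for any } t \in \mathbb{R}.$$

Thus

$$\sum_{i=1}^n p_i \Psi\Big(\log \frac{(A\mathbf{x})_i}{x_i}\Big) \ge \Psi(\log \rho(A)) + s\Big(\sum_{i=1}^n p_i \log \frac{(A\mathbf{x})_i}{x_i} - \log \rho(A)\Big).$$

Use the equality (6.8.1) for $\Psi(t) = t$ to deduce that $\mu(A) \ge \Psi(\log \rho(A))$.
Combine that with the inequality $\mu(A) \le \Psi(\log \rho(A))$ to deduce (6.8.1) for
any irreducible $A$.

Suppose next that $A \in \mathbb{R}_+^{n \times n}$ is reducible and $\rho(A) > 0$. By applying
a permutational similarity to $A$, if necessary, we may assume that $A =
[a_{ij}]$ and $B = [a_{ij}]_{i,j=1}^m \in \mathbb{R}_+^{m \times m}$, $1 \le m < n$, is an irreducible submatrix
of $A$ with $\rho(B) = \rho(A)$. Clearly, for any $\mathbf{x} > \mathbf{0}$, $(A(x_1, \ldots, x_n)^T)_i \ge
(B(x_1, \ldots, x_m)^T)_i$ for $i = 1, \ldots, m$. Since $\Psi$ is nondecreasing we obtain the
following set of inequalities

$$\mu(A) \ge \sup_{\mathbf{p} \in \Pi_m} \inf_{\mathbf{x} > \mathbf{0}} \sum_{i=1}^m p_i \Psi\Big(\log \frac{(A\mathbf{x})_i}{x_i}\Big) \ge \mu(B) = \Psi(\log \rho(B)).$$

Use the equality $\rho(A) = \rho(B)$ and the inequality $\mu(A) \le \Psi(\log \rho(A))$ to
deduce the theorem.

Assume now that

$$\rho(A) > 0, \quad \Psi'(\log \rho(A)) > 0, \quad A\mathbf{u} = \rho(A)\mathbf{u}, \quad \mathbf{u} > \mathbf{0},$$

and equality (6.8.2) holds. So the infimum is achieved at $\mathbf{x} = \mathbf{u}$. Since
$\mathbf{x} = \mathbf{u}$ is a critical point we deduce that $A^T (\mathbf{p} \circ \mathbf{u}^{-1}) = \rho(A)\, \mathbf{p} \circ \mathbf{u}^{-1}$. If $A$ is
irreducible then $\mathbf{p}$ is unique. □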



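Corollary 6.8.2 Let $A \in \mathbb{R}_+^{n \times n}$. Then

(6.8.3) $\sup_{\mathbf{p} = (p_1, \ldots, p_n)^T \in \Pi_n} \ \inf_{\mathbf{x} = (x_1, \ldots, x_n)^T > \mathbf{0}} \ \sum_{i=1}^n p_i \frac{(A\mathbf{x})_i}{x_i} = \rho(A).$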







Suppose that $\rho(A) > 0$ and $A$ has a positive eigenvector $\mathbf{u}$ which corresponds
to $\rho(A)$. If

(6.8.4) $\inf_{\mathbf{x} = (x_1, \ldots, x_n)^T > \mathbf{0}} \ \sum_{i=1}^n p_i \frac{(A\mathbf{x})_i}{x_i} = \rho(A)$

then the vector $\mathbf{v} = \mathbf{p} \circ \mathbf{u}^{-1}$ is a nonnegative eigenvector of $A^T$ corresponding
to $\rho(A)$. In particular, if $A$ is irreducible, then $\mathbf{p}$ satisfying (6.8.4) is
unique.

Proof. If $\rho(A) > 0$ the corollary follows from Theorem 6.8.1 by letting
$\Psi(t) = e^t$. For $\rho(A) = 0$ apply the corollary to $A_1 = A + I$ to deduce the
corollary in this case. □



Theorem 6.8.3 Let $\mathcal{D}_{n,+}$ denote the convex set of all $n \times n$ nonnegative
diagonal matrices. Assume that $A \in \mathbb{R}_+^{n \times n}$. Then

(6.8.5) $\rho(A + tD_1 + (1 - t)D_2) \le t\rho(A + D_1) + (1 - t)\rho(A + D_2)$

for $t \in (0, 1)$ and $D_1, D_2 \in \mathcal{D}_{n,+}$. If $A$ is irreducible then equality holds if
and only if $D_1 - D_2 = aI$.

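Proof. Let $\psi(\mathbf{p}) = \inf_{\mathbf{x} > \mathbf{0}} \sum_{i=1}^n p_i \frac{(A\mathbf{x})_i}{x_i}$ for $\mathbf{p} \in \Pi_n$. Since $\frac{((A + D)\mathbf{x})_i}{x_i} =
d_i + \frac{(A\mathbf{x})_i}{x_i}$ for $D = \operatorname{diag}(d_1, \ldots, d_n)$ we deduce that

$$\psi(D, \mathbf{p}) := \inf_{\mathbf{x} > \mathbf{0}} \sum_{i=1}^n p_i \frac{((A + D)\mathbf{x})_i}{x_i} = \sum_{i=1}^n p_i d_i + \psi(\mathbf{p}).$$

Thus $\psi(D, \mathbf{p})$ is an affine function of $D$, hence convex on $\mathcal{D}_{n,+}$. Therefore, $\rho(A +
D) = \sup_{\mathbf{p} \in \Pi_n} \psi(D, \mathbf{p})$ is a convex function on $\mathcal{D}_{n,+}$. Hence (6.8.5) holds
for any $t \in (0, 1)$ and $D_1, D_2 \in \mathcal{D}_{n,+}$.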

Suppose that $A$ is irreducible and equality holds in (6.8.5). Since $\rho(A +
bI + D) = b + \rho(A + D)$ for any $b > 0$, we may assume without loss
of generality that all the diagonal elements of $A$ are positive. Let $A_0 =
A + tD_1 + (1 - t)D_2$. Since $A_0$ has a positive diagonal and is irreducible we
deduce that $A_0\mathbf{u} = r\mathbf{u}$, $A_0^T\mathbf{v} = r\mathbf{v}$ where $r > 0$, $\mathbf{u}, \mathbf{v} > \mathbf{0}$, $\mathbf{w} := \mathbf{v} \circ \mathbf{u} \in \Pi_n$.
Corollary 6.8.2 yields that

$$\rho(A_0) = \psi(tD_1 + (1 - t)D_2, \mathbf{w}) = t\psi(D_1, \mathbf{w}) + (1 - t)\psi(D_2, \mathbf{w}).$$

Hence, equality in (6.8.5) implies that

$$\rho(A + D_1) = \psi(D_1, \mathbf{w}) = \rho(A_0) + (1 - t)\sum_{i=1}^n w_i(d_{i,1} - d_{i,2}),$$

$$\rho(A + D_2) = \psi(D_2, \mathbf{w}) = \rho(A_0) + t\sum_{i=1}^n w_i(d_{i,2} - d_{i,1}),$$

$$D_1 = \operatorname{diag}(d_{1,1}, \ldots, d_{n,1}), \quad D_2 = \operatorname{diag}(d_{1,2}, \ldots, d_{n,2}).$$

Furthermore, the infima on $\mathbf{x} > \mathbf{0}$ in $\psi(D_1, \mathbf{w})$ and $\psi(D_2, \mathbf{w})$ are
achieved for $\mathbf{x} = \mathbf{u}$. Corollary 6.8.2 yields that $\mathbf{u}$ is the Perron-Frobenius
eigenvector of $A + D_1$ and $A + D_2$. Hence $D_1 - D_2 = aI$. Clearly, if $D_1 - D_2 = aI$
then equality holds in (6.8.5). □
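
The convexity inequality (6.8.5) and its equality case can be illustrated numerically. This is a hedged sketch, not part of the text: `A`, the diagonals `d1`, `d2`, `e1`, `e2` and the helper `rho_shift` are hypothetical, and $\rho$ is approximated by power iteration on positive matrices.

```python
def matvec(A, x):
    """Return the product A x for a list-of-rows matrix A."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def perron_root(A, iters=2000):
    """Approximate rho(A) for an entrywise positive A by power iteration."""
    u = [1.0 / len(A)] * len(A)
    rho = 0.0
    for _ in range(iters):
        y = matvec(A, u)
        rho = sum(y)               # sum(u) = 1, so sum(A u) estimates rho
        u = [yi / rho for yi in y]
    return rho

def rho_shift(A, d):
    """rho(A + diag(d)) for a nonnegative diagonal d."""
    n = len(A)
    shifted = [[A[i][j] + (d[i] if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    return perron_root(shifted)

# Hypothetical positive matrix and two nonnegative diagonals.
A = [[1.0, 2.0, 0.5],
     [0.5, 1.0, 1.0],
     [2.0, 0.3, 1.0]]
d1 = [0.0, 2.0, 5.0]
d2 = [3.0, 0.0, 1.0]

# gap(t) = t rho(A + D1) + (1 - t) rho(A + D2) - rho(A + t D1 + (1 - t) D2)
gaps = []
for k in range(1, 10):
    t = k / 10.0
    mix = [t * a + (1 - t) * b for a, b in zip(d1, d2)]
    gaps.append(t * rho_shift(A, d1) + (1 - t) * rho_shift(A, d2)
                - rho_shift(A, mix))

# Equality case: D1 - D2 = a I (here a = 1).
e1 = [2.0, 3.0, 5.0]
e2 = [1.0, 2.0, 4.0]
t = 0.3
eq_gap = (t * rho_shift(A, e1) + (1 - t) * rho_shift(A, e2)
          - rho_shift(A, [t * a + (1 - t) * b for a, b in zip(e1, e2)]))
```

All entries of `gaps` are nonnegative, while `eq_gap` vanishes up to rounding, in line with the statement that equality requires $D_1 - D_2 = aI$.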

Theorem 6.8.4 Let $A \in \mathbb{R}_+^{n \times n}$ be an inverse of an M-matrix. Then

(6.8.6) $\rho((tD_1 + (1 - t)D_2)A) \le t\rho(D_1 A) + (1 - t)\rho(D_2 A)$,

for $t \in (0, 1)$, $D_1, D_2 \in \mathcal{D}_{n,+}$. If $A > 0$ and $D_1, D_2$ have positive diagonal
entries then equality holds if and only if $D_1 = aD_2$.

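Proof. Let $A = B^{-1}$, where $B = rI - C$, $C \in \mathbb{R}_+^{n \times n}$ and $\rho(C) < r$. Use the
Neumann expansion to deduce that $A = \sum_{i=0}^{\infty} r^{-(i+1)} C^i$. Hence, $A > 0$ if
and only if $C$ is irreducible. Assume first that $A$ is positive. Denote by $\mathcal{D}_{n,+}^{o}$
the set of diagonal matrices with positive diagonal, i.e. the interior of $\mathcal{D}_{n,+}$.
Clearly, $DA > 0$ for $D \in \mathcal{D}_{n,+}^{o}$. Thus, $\rho(DA) > 0$ is a simple root
of $\det(\lambda I - DA)$. Hence $\rho(DA)$ is an analytic function on $\mathcal{D}_{n,+}^{o}$. Denote
by $\nabla \rho(DA) \in \mathbb{R}^n$ the gradient of $\rho(DA)$ as a function on $\mathcal{D}_{n,+}$. Since
$\rho(DA) \in C^2(\mathcal{D}_{n,+}^{o})$ it follows that convexity of $\rho(DA)$ on $\mathcal{D}_{n,+}^{o}$ is equivalent
to the following inequality.

(6.8.7) $\rho(D(\mathbf{d})A) \ge \rho(D(\mathbf{d}_0)A) + \nabla \rho(D(\mathbf{d}_0)A)^T(\mathbf{d} - \mathbf{d}_0)$, $\mathbf{d}, \mathbf{d}_0 > \mathbf{0}$.

See Problem 1. We now show (6.8.7). Let

$$D_0 A\mathbf{u} = \rho(D_0 A)\mathbf{u}, \quad \mathbf{v}^T D_0 A = \rho(D_0 A)\mathbf{v}^T, \quad \mathbf{u}, \mathbf{v} > \mathbf{0}, \quad \mathbf{v} \circ \mathbf{u} \in \Pi_n, \quad D_0 = D(\mathbf{d}_0).$$

Theorem 6.7.4 implies that $\nabla \rho(D_0 A) = \rho(D_0 A)\, \mathbf{v} \circ \mathbf{u} \circ \mathbf{d}_0^{-1}$. Hence, (6.8.7)
is equivalent to

$$\rho(D(\mathbf{d})A) \ge \rho(D_0 A)(\mathbf{v} \circ \mathbf{u})^T(\mathbf{d} \circ \mathbf{d}_0^{-1}).$$

Let $D(\mathbf{d})A\mathbf{w} = \rho(D(\mathbf{d})A)\mathbf{w}$, $\mathbf{w} > \mathbf{0}$. Then the above inequality follows
from the inequality

(6.8.8) $\rho(D_0 A)^{-1} \ge (\mathbf{v} \circ \mathbf{u})^T(\mathbf{w} \circ (D_0 A\mathbf{w})^{-1})$.

This inequality follows from the inf sup characterization of $\rho(D_0 A)^{-1}$. See
Problem 2. The equality case in (6.8.7) follows from the similar arguments
for the equality case in (6.8.6). Since $\rho(DA)$ is a continuous function on
$\mathcal{D}_{n,+}$, the convexity of $\rho(DA)$ on $\mathcal{D}_{n,+}^{o}$ yields the convexity of $\rho(DA)$ on
$\mathcal{D}_{n,+}$.

Consider now the case where $A^{-1} = rI - B$, $B \in \mathbb{R}_+^{n \times n}$, $r > \rho(B)$, and
$B$ is reducible. Then there exists $b > 0$ such that $\rho(B + b\mathbf{1}\mathbf{1}^T) < r$.
For $\varepsilon \in (0, b)$ let $A(\varepsilon) := (rI - (B + \varepsilon\mathbf{1}\mathbf{1}^T))^{-1}$. Then the inequality (6.8.7)
holds if $A$ is replaced by $A(\varepsilon)$. Let $\varepsilon \searrow 0$ to deduce (6.8.7). □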



Problems 

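1. (a) Let $f \in C^2(a, b)$. Show that $f''(x) \ge 0$ for $x \in (a, b)$ if and only
if $f(x) \ge f(x_0) + f'(x_0)(x - x_0)$ for each $x, x_0 \in (a, b)$.

(b) Let $D \subset \mathbb{R}^n$ be an open convex set. Assume that $f \in C^2(D)$.
Let $H(f)(\mathbf{x}) = [\frac{\partial^2 f}{\partial x_i \partial x_j}](\mathbf{x}) \in \mathrm{S}(n, \mathbb{R})$ for $\mathbf{x} \in D$. Show that
$H(f)(\mathbf{x}) \ge 0$ for all $\mathbf{x} \in D$ if and only if $f(\mathbf{x}) \ge f(\mathbf{x}_0) + \nabla f(\mathbf{x}_0)^T(\mathbf{x} -
\mathbf{x}_0)$ for all $\mathbf{x}, \mathbf{x}_0 \in D$. (Hint: Restrict $f$ to an interval $(\mathbf{u}, \mathbf{v}) \subset D$
and use part (a) of the problem.)

2. (a) Let $F \in \mathbb{R}_+^{n \times n}$ be an inverse of an M-matrix. Show

$$\frac{1}{\rho(F)} = \inf_{\mathbf{p} = (p_1, \ldots, p_n)^T \in \Pi_n} \ \sup_{\mathbf{x} = (x_1, \ldots, x_n)^T > \mathbf{0}} \ \sum_{i=1}^n p_i \frac{x_i}{(F\mathbf{x})_i}.$$

Hint: Use Corollary 6.8.2.

(b) Let $0 < F \in \mathbb{R}_+^{n \times n}$ be an inverse of an M-matrix. Assume that

$$F\mathbf{u} = \rho(F)\mathbf{u}, \quad F^T\mathbf{v} = \rho(F)\mathbf{v}, \quad \mathbf{u}, \mathbf{v} > \mathbf{0}, \quad \mathbf{v} \circ \mathbf{u} \in \Pi_n.$$

Show

$$\frac{1}{\rho(F)} = \sup_{\mathbf{x} = (x_1, \ldots, x_n)^T > \mathbf{0}} \ \sum_{i=1}^n v_i u_i \frac{x_i}{(F\mathbf{x})_i}.$$

Furthermore, $\frac{1}{\rho(F)} = \sum_{i=1}^n v_i u_i \frac{x_i}{(F\mathbf{x})_i}$ for $\mathbf{x} > \mathbf{0}$ if and only if
$F\mathbf{x} = \rho(F)\mathbf{x}$.

(c) Show (6.8.8). Hint: Use Corollary 6.6.4 to show that $A^{-1}D_0^{-1}$
is an M-matrix.

3. Let $P = [\delta_{i(j+1)}] \in \mathcal{P}_n$ be a cyclic permutation matrix. Show that
$\rho(D(\mathbf{d})P) = (\prod_{i=1}^n d_i)^{\frac{1}{n}}$ for any $\mathbf{d} \in \mathbb{R}_+^n$.

4. Show that (6.8.7) does not hold for all $A \in \mathbb{R}_+^{n \times n}$.

5. Let $A \in \mathbb{R}_+^{n \times n}$ be an inverse of an M-matrix. Show that the convexity
of $\rho(D(e^{\mathbf{x}})A)$ on $\mathbb{R}^n$ is implied by the convexity of $\rho(DA)$ on $\mathcal{D}_{n,+}$.
Hint: Use the generalized arithmetic-geometric inequality.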

6.9 Application to cellular communication 

6.9.1 Introduction 

Power control is used in cellular and ad-hoc networks to provide a high 
signal-to-noise ratio (SNR) for a reliable connection. A higher SNR also 
allows a wireless system that uses link adaptation to transmit at a higher 
data rate, thus leading to a greater spectral efficiency. Transmission rate 
adaptation by power control is an active research area in communication 
networks that can be used for both interference management and utility 
maximization [Sri03]. 

The motivation for the problems studied in this section comes from
maximizing the sum rate (data throughput) in wireless communications. Due to
the broadcast nature of radio transmission, data rates in a wireless network 
are affected by interference. This is particularly true in Code Division Mul- 
tiple Access (CDMA) systems, where users transmit at the same time over 
the same frequency bands and their spreading codes are not perfectly or- 
thogonal. Transmit power control is often used to control signal interference 
to maximize the total transmission rates of all users. 

6.9.2 Statement of problems 

Consider a wireless network, e.g., a cellular network, with $L$ logical
transmitter/receiver pairs. Transmit powers are denoted as $p_1, \ldots, p_L$. Let
$\mathbf{p} = (p_1, \ldots, p_L)^T \ge \mathbf{0}$ be the power transmission vector. In many
situations we will assume that $\mathbf{p} \le \bar{\mathbf{p}} := (\bar{p}_1, \ldots, \bar{p}_L)^T$, where $\bar{p}_l$ is the maximal
transmit power of the user $l$. In the cellular uplink case, all logical
receivers may reside in the same physical receiver, i.e., the base station. Let
$G = [g_{ij}]_{i,j=1}^L > 0_{L \times L}$ represent the channel gain, where $g_{ij}$ is the channel
gain from the $j$th transmitter to the $i$th receiver, and let $n_l$ be the noise
power for the $l$th receiver. The Signal-to-Interference Ratio (SIR)
for the $l$th receiver is denoted by $\gamma_l = \gamma_l(\mathbf{p})$. The map $\mathbf{p} \mapsto \boldsymbol{\gamma}(\mathbf{p})$ is given
by

(6.9.1) $\gamma_l(\mathbf{p}) := \frac{g_{ll}\, p_l}{\sum_{j \ne l} g_{lj} p_j + n_l}$, $l = 1, \ldots, L$, $\boldsymbol{\gamma}(\mathbf{p}) = (\gamma_1(\mathbf{p}), \ldots, \gamma_L(\mathbf{p}))^T$.

That is, the power $p_l$ is amplified by the factor $g_{ll}$, and diminished by other
users and the noise, inversely proportional to $\sum_{j \ne l} g_{lj} p_j + n_l$.
Define

(6.9.2) $F = [f_{ij}]_{i,j=1}^L$, where $f_{ij} = 0$ if $i = j$, and $f_{ij} = \frac{g_{ij}}{g_{ii}}$ if $i \ne j$,

and

(6.9.3) $\mathbf{g} = (g_{11}, \ldots, g_{LL})^T$, $\mathbf{n} = (n_1, \ldots, n_L)^T$,
$\mathbf{v} = (v_1, \ldots, v_L)^T := (\frac{n_1}{g_{11}}, \frac{n_2}{g_{22}}, \ldots, \frac{n_L}{g_{LL}})^T$.

Then

(6.9.4) $\boldsymbol{\gamma}(\mathbf{p}) = \mathbf{p} \circ (F\mathbf{p} + \mathbf{v})^{-1}$.
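
The passage from (6.9.1) to the normalized form (6.9.4) can be checked on a small example. In the following sketch the gain matrix, noise powers and transmit powers are hypothetical illustrative numbers, and the function names are not from the text.

```python
def sir_direct(G, noise, p):
    """gamma_l(p) = g_ll p_l / (sum_{j != l} g_lj p_j + n_l), as in (6.9.1)."""
    L = len(p)
    return [G[l][l] * p[l] /
            (sum(G[l][j] * p[j] for j in range(L) if j != l) + noise[l])
            for l in range(L)]

def normalize(G, noise):
    """Build F and v of (6.9.2)-(6.9.3) from the gains and noise powers."""
    L = len(G)
    F = [[0.0 if i == j else G[i][j] / G[i][i] for j in range(L)]
         for i in range(L)]
    v = [noise[i] / G[i][i] for i in range(L)]
    return F, v

def sir_matrix(F, v, p):
    """gamma(p) = p o (F p + v)^{-1}, as in (6.9.4)."""
    Fp = [sum(f * pj for f, pj in zip(row, p)) for row in F]
    return [pl / (y + vl) for pl, y, vl in zip(p, Fp, v)]

# Hypothetical 3-user example; the numbers are illustrative only.
G = [[1.0, 0.1, 0.2],
     [0.2, 2.0, 0.1],
     [0.3, 0.1, 1.5]]
noise = [0.1, 0.2, 0.05]
p = [1.0, 0.5, 2.0]

F, v = normalize(G, noise)
gamma_a = sir_direct(G, noise, p)   # from the raw gains
gamma_b = sir_matrix(F, v, p)       # from the normalized pair (F, v)
```

Both routes give the same SIR vector up to rounding, since dividing the numerator and denominator of (6.9.1) by $g_{ll}$ yields exactly the entries of $F$ and $\mathbf{v}$.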

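Let

(6.9.5) $\Phi_{\mathbf{w}}(\boldsymbol{\gamma}) := \sum_{i=1}^L w_i \log(1 + \gamma_i)$, where $\mathbf{w} = (w_1, \ldots, w_L)^T \in \Pi_L$, $\boldsymbol{\gamma} \in \mathbb{R}_+^L$.

The function $\Phi_{\mathbf{w}}(\boldsymbol{\gamma}(\mathbf{p}))$ is the sum rate of the interference-limited channel.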

We can study the following optimal problems in the power vector $\mathbf{p}$. The
first problem is concerned with finding the optimal power that maximizes
the minimal SIR for all users:

(6.9.6) $\max_{\mathbf{p} \in [\mathbf{0}, \bar{\mathbf{p}}]} \ \min_{i \in \langle L \rangle} \gamma_i(\mathbf{p})$.

The second, more interesting problem, is the sum rate maximization
problem in interference-limited channels

(6.9.7) $\max_{\mathbf{p} \in [\mathbf{0}, \bar{\mathbf{p}}]} \Phi_{\mathbf{w}}(\boldsymbol{\gamma}(\mathbf{p}))$.

The exact solution to this problem is known to be NP-complete [Luo08].
Note that for fixed $p_1, \ldots, p_{l-1}, p_{l+1}, \ldots, p_L$ each $\gamma_j(\mathbf{p})$, $j \ne l$, is a
decreasing function of $p_l$, while $\gamma_l(\mathbf{p})$ is an increasing function of $p_l$. Thus, if
$w_l = 0$ we can assume that in the maximal problem (6.9.7) we can choose
$p_l = 0$. Hence, it is enough to study the maximal problem (6.9.7) in the
case $\mathbf{w} > \mathbf{0}$.
6.9.3 Relaxations of optimal problems

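In this subsection, we study several relaxed versions of (6.9.6) and (6.9.7).
We will assume first that we do not have the restriction $\mathbf{p} \le \bar{\mathbf{p}}$. Let $\boldsymbol{\gamma}(\mathbf{p}, \mathbf{n})$
be given by (6.9.1). Note that since $\mathbf{n} > \mathbf{0}$ we obtain

$$\boldsymbol{\gamma}(t\mathbf{p}, \mathbf{n}) = \boldsymbol{\gamma}(\mathbf{p}, \tfrac{1}{t}\mathbf{n}) \ \Rightarrow \ \boldsymbol{\gamma}(t\mathbf{p}, \mathbf{n}) > \boldsymbol{\gamma}(\mathbf{p}, \mathbf{n}) \text{ for } t > 1.$$

Thus, to increase the values of the optimal problems in (6.9.6) and (6.9.7),
we let $t \to \infty$, which is equivalent to the assumption in this subsection that
$\mathbf{n} = \mathbf{0}$.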

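Theorem 6.9.1 Let $F \in \mathbb{R}_+^{L \times L}$, $L \ge 2$, be a matrix with positive
off-diagonal entries. Let $F\mathbf{u} = \rho(F)\mathbf{u}$ for a unique $\mathbf{0} < \mathbf{u} \in \Pi_L$. Then

(6.9.8) $\max_{\mathbf{0} < \mathbf{p} \in \Pi_L} \ \min_{l \in \langle L \rangle} \frac{p_l}{(F\mathbf{p})_l} = \frac{1}{\rho(F)}$,

which is achieved only for $\mathbf{p} = \mathbf{u}$. In particular, the value of the optimal
problem given in (6.9.6) is less than $\frac{1}{\rho(F)}$.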

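Proof. Clearly, the left-hand side of (6.9.8) is equal to $(\min_{\mathbf{0} < \mathbf{p}} \max_{l \in \langle L \rangle} \frac{(F\mathbf{p})_l}{p_l})^{-1}$.
Since $F$ is irreducible, our theorem follows from Problem 11.

Clearly $\boldsymbol{\gamma}(\mathbf{p}, \mathbf{n}) < \boldsymbol{\gamma}(\mathbf{p}, \mathbf{0})$. Hence, for $\mathbf{p} > \mathbf{0}$, $\min_{l \in \langle L \rangle} \gamma_l(\mathbf{p}, \mathbf{n}) < \min_{l \in \langle L \rangle} \gamma_l(\mathbf{p}, \mathbf{0})$.
Since $\mathbf{p} \in [\mathbf{0}, \bar{\mathbf{p}}] \subset \mathbb{R}_+^L$ we deduce that the value of the optimal problem given
in (6.9.6) is less than $\frac{1}{\rho(F)}$. □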
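
The max-min identity (6.9.8) can be probed numerically in the noiseless case. The sketch below rests on assumptions: `F` is a hypothetical $3 \times 3$ matrix with zero diagonal and positive off-diagonal entries (so $F^2 > 0$ and power iteration is assumed to converge), and the helper names are illustrative.

```python
import random

def matvec(A, x):
    """Return the product A x for a list-of-rows matrix A."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def perron(A, iters=2000):
    """Power iteration: returns (rho, u) with A u ~ rho u and sum(u) = 1.
    Assumes A is nonnegative and primitive."""
    u = [1.0 / len(A)] * len(A)
    rho = 0.0
    for _ in range(iters):
        y = matvec(A, u)
        rho = sum(y)
        u = [yi / rho for yi in y]
    return rho, u

def min_sir(F, p):
    """min_l p_l / (F p)_l, the noiseless minimal SIR of (6.9.8)."""
    return min(pl / y for pl, y in zip(p, matvec(F, p)))

# Hypothetical normalized gain matrix: zero diagonal, positive off-diagonal.
F = [[0.0, 0.2, 0.1],
     [0.3, 0.0, 0.2],
     [0.1, 0.4, 0.0]]
rho, u = perron(F)

best_at_u = min_sir(F, u)          # value at the Perron vector

# No random positive power vector should beat the Perron vector.
random.seed(1)
trials = [[random.uniform(0.01, 5.0) for _ in range(3)] for _ in range(2000)]
best_random = max(min_sir(F, p) for p in trials)
```

The value `best_at_u` matches $1/\rho(F)$ up to rounding, and `best_random` does not exceed it, consistent with the Collatz-Wielandt bound $\max_l (F\mathbf{p})_l / p_l \ge \rho(F)$.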

We now consider the relaxation problem of (6.9.7). We approximate
$\log(1 + x)$ by $\log x$ for $x \ge 0$. Clearly, $\log(1 + x) \ge \log x$. Let

(6.9.9) $\Psi_{\mathbf{w}}(\boldsymbol{\gamma}) = \sum_{j=1}^L w_j \log \gamma_j$, $\boldsymbol{\gamma} = (\gamma_1, \ldots, \gamma_L)^T$.

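Theorem 6.9.2 Let $F = [f_{ij}] \in \mathbb{R}_+^{L \times L}$ have positive off-diagonal
elements and zero diagonal entries. Assume that $L \ge 3$, $\mathbf{w} = (w_1, \ldots, w_L)^T >
\mathbf{0}$, and suppose that $\mathbf{w}$ satisfies the inequalities (6.6.10) for each $j \in \langle L \rangle$,
where $n = L$. Let $D_1 = \operatorname{diag}(d_{1,1}, \ldots, d_{L,1})$, $D_2 = \operatorname{diag}(d_{1,2}, \ldots, d_{L,2})$
be two diagonal matrices, with positive diagonal entries, such that $B =
D_1 F D_2$, $B\mathbf{1} = \mathbf{1}$, $B^T\mathbf{w} = \mathbf{w}$. (As given by Theorem 6.6.12.) Then

(6.9.10) $\max_{\mathbf{p} > \mathbf{0}} \Psi_{\mathbf{w}}(\boldsymbol{\gamma}(\mathbf{p})) = \sum_{j=1}^L w_j \log d_{j,1} d_{j,2}$.

Equality holds if and only if $\mathbf{p} = tD_2\mathbf{1}$ for some $t > 0$.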
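Proof. Let $\mathbf{p} = D_2\mathbf{x}$. Then

$$\Psi_{\mathbf{w}}(\boldsymbol{\gamma}(\mathbf{p})) = \sum_{j=1}^L w_j \log \frac{p_j}{(F\mathbf{p})_j} = \sum_{j=1}^L w_j \log d_{j,1} d_{j,2} - \sum_{j=1}^L w_j \log \frac{(B\mathbf{x})_j}{x_j}.$$

Use Theorem 6.6.9 to deduce that the above expression is not more than
the right-hand side of (6.9.10). For $\mathbf{x} = \mathbf{1}$ equality holds. From the proof of
the second part of Theorem 6.6.12 it follows that this minimum is achieved
only for $\mathbf{x} = t\mathbf{1}$, which is equivalent to $\mathbf{p} = tD_2\mathbf{1}$. □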



6.9.4 Preliminary results 

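Claim 6.9.3 Let $\mathbf{p} \ge \mathbf{0}$ be a nonnegative vector. Assume that $\boldsymbol{\gamma}(\mathbf{p})$ is
defined by (6.9.1). Then $\rho(\operatorname{diag}(\boldsymbol{\gamma}(\mathbf{p}))F) < 1$, where $F$ is defined by (6.9.2).
Hence, for $\boldsymbol{\gamma} = \boldsymbol{\gamma}(\mathbf{p})$,

(6.9.11) $\mathbf{p} = P(\boldsymbol{\gamma}) := (I - \operatorname{diag}(\boldsymbol{\gamma})F)^{-1}\operatorname{diag}(\boldsymbol{\gamma})\mathbf{v}$.

Vice versa, if $\boldsymbol{\gamma}$ is in the set

(6.9.12) $\Gamma := \{\boldsymbol{\gamma} \ge \mathbf{0}, \ \rho(\operatorname{diag}(\boldsymbol{\gamma})F) < 1\}$,

then the vector $\mathbf{p}$ defined by (6.9.11) is nonnegative. Furthermore, $\boldsymbol{\gamma}(P(\boldsymbol{\gamma})) =
\boldsymbol{\gamma}$. That is, $\boldsymbol{\gamma} : \mathbb{R}_+^L \to \Gamma$ and $P : \Gamma \to \mathbb{R}_+^L$ are inverse mappings.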

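Proof. Observe that (6.9.1) is equivalent to the equality

(6.9.13) $\mathbf{p} = \operatorname{diag}(\boldsymbol{\gamma})F\mathbf{p} + \operatorname{diag}(\boldsymbol{\gamma})\mathbf{v}$.

Assume first that $\mathbf{p}$ is a positive vector, i.e., $\mathbf{p} > \mathbf{0}$. Hence, $\boldsymbol{\gamma}(\mathbf{p}) > \mathbf{0}$.
Since all off-diagonal entries of $F$ are positive it follows that the matrix
$\operatorname{diag}(\boldsymbol{\gamma})F$ is irreducible. As $\mathbf{v} > \mathbf{0}$, we deduce that $\max_{i \in [1, L]} \frac{(\operatorname{diag}(\boldsymbol{\gamma})F\mathbf{p})_i}{p_i} <
1$. The min-max characterization of Wielandt of $\rho(\operatorname{diag}(\boldsymbol{\gamma})F)$, [?], implies
$\rho(\operatorname{diag}(\boldsymbol{\gamma})F) < 1$. Hence, $\boldsymbol{\gamma}(\mathbf{p}) \in \Gamma$. Assume now that $\mathbf{p} \ge \mathbf{0}$. Note that
$p_i > 0 \iff \gamma_i(\mathbf{p}) > 0$. So $\mathbf{p} = \mathbf{0} \iff \boldsymbol{\gamma}(\mathbf{p}) = \mathbf{0}$. Clearly, $\rho(\operatorname{diag}(\boldsymbol{\gamma}(\mathbf{0}))F) =
\rho(0_{L \times L}) = 0 < 1$. Assume now that $\mathbf{p} \gneq \mathbf{0}$. Let $\mathcal{A} = \{i : p_i > 0\}$. Denote by
$\boldsymbol{\gamma}(\mathbf{p})(\mathcal{A})$ the vector composed of the positive entries of $\boldsymbol{\gamma}(\mathbf{p})$. Let $F(\mathcal{A})$ be the
principal submatrix of $F$ with rows and columns in $\mathcal{A}$. It is straightforward
to see that $\rho(\operatorname{diag}(\boldsymbol{\gamma}(\mathbf{p}))F) = \rho(\operatorname{diag}(\boldsymbol{\gamma}(\mathbf{p})(\mathcal{A}))F(\mathcal{A}))$. The arguments above
imply that

$$\rho(\operatorname{diag}(\boldsymbol{\gamma}(\mathbf{p}))F) = \rho(\operatorname{diag}(\boldsymbol{\gamma}(\mathbf{p})(\mathcal{A}))F(\mathcal{A})) < 1.$$

Assume now that $\boldsymbol{\gamma} \in \Gamma$. Then

(6.9.14) $(I - \operatorname{diag}(\boldsymbol{\gamma})F)^{-1} = \sum_{k=0}^{\infty} (\operatorname{diag}(\boldsymbol{\gamma})F)^k \ge 0_{L \times L}$.

Hence, $P(\boldsymbol{\gamma}) \ge \mathbf{0}$. The definition of $P(\boldsymbol{\gamma})$ implies that $\boldsymbol{\gamma}(P(\boldsymbol{\gamma})) = \boldsymbol{\gamma}$. □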
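
The inverse relation between $\boldsymbol{\gamma}$ and $P$ in Claim 6.9.3 can be traced on a small example. The sketch below makes several assumptions: `F`, `v` and `p0` are hypothetical data, and $P(\boldsymbol{\gamma})$ is computed not by a matrix inverse but by the fixed-point iteration $\mathbf{p} \leftarrow \operatorname{diag}(\boldsymbol{\gamma})(F\mathbf{p} + \mathbf{v})$ started at $\mathbf{0}$, whose iterates are the partial sums of the Neumann series (6.9.14) applied to $\operatorname{diag}(\boldsymbol{\gamma})\mathbf{v}$.

```python
def matvec(A, x):
    """Return the product A x for a list-of-rows matrix A."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def sir(F, v, p):
    """gamma(p) = p o (F p + v)^{-1}, see (6.9.4)."""
    return [pl / (y + vl) for pl, y, vl in zip(p, matvec(F, p), v)]

def power_from_sir(F, v, gamma, iters=2000):
    """Approximate P(gamma) of (6.9.11) by iterating (6.9.13) from p = 0.
    Converges when rho(diag(gamma) F) < 1, i.e. when gamma lies in Gamma."""
    p = [0.0] * len(v)
    for _ in range(iters):
        p = [g * (y + vl) for g, y, vl in zip(gamma, matvec(F, p), v)]
    return p

# Hypothetical data: zero-diagonal F, positive normalized noise v.
F = [[0.0, 0.2, 0.1],
     [0.3, 0.0, 0.2],
     [0.1, 0.4, 0.0]]
v = [0.1, 0.1, 0.1]
p0 = [1.0, 0.5, 2.0]

gamma = sir(F, v, p0)

# Wielandt-type bound used in the proof: max_i (diag(gamma) F p)_i / p_i < 1.
wielandt = max(g * y / pl for g, y, pl in zip(gamma, matvec(F, p0), p0))

p_back = power_from_sir(F, v, gamma)    # should recover p0
gamma_back = sir(F, v, p_back)          # should recover gamma
```

In this example `wielandt` is below $1$, so the iteration converges, `p_back` reproduces `p0`, and `gamma_back` reproduces `gamma`, up to rounding.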

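Claim 6.9.4 The set $\Gamma \subset \mathbb{R}_+^L$ is monotonic with respect to the order
$\ge$. That is, if $\boldsymbol{\gamma} \in \Gamma$ and $\boldsymbol{\gamma} \ge \boldsymbol{\beta} \ge \mathbf{0}$ then $\boldsymbol{\beta} \in \Gamma$. Furthermore, the function
$P(\boldsymbol{\gamma})$ is monotone on $\Gamma$.

(6.9.15) $P(\boldsymbol{\gamma}) \ge P(\boldsymbol{\beta})$ if $\boldsymbol{\gamma} \in \Gamma$ and $\boldsymbol{\gamma} \ge \boldsymbol{\beta} \ge \mathbf{0}$.

Equality holds if and only if $\boldsymbol{\gamma} = \boldsymbol{\beta}$.

Proof. Clearly, if $\boldsymbol{\gamma} \ge \boldsymbol{\beta} \ge \mathbf{0}$ then $\operatorname{diag}(\boldsymbol{\gamma})F \ge \operatorname{diag}(\boldsymbol{\beta})F$, which implies
$\rho(\operatorname{diag}(\boldsymbol{\gamma})F) \ge \rho(\operatorname{diag}(\boldsymbol{\beta})F)$. Hence, $\Gamma$ is monotonic. Use the Neumann
expansion (6.9.14) to deduce the monotonicity of $P$. The equality case is
straightforward. □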

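Note that $\boldsymbol{\gamma}(\mathbf{p})$ is not monotonic in $\mathbf{p}$. Indeed, if one increases only
the $i$th coordinate of $\mathbf{p}$, then one increases the $i$th coordinate of $\boldsymbol{\gamma}(\mathbf{p})$ and
decreases all other coordinates of $\boldsymbol{\gamma}(\mathbf{p})$.

As usual, let $\mathbf{e}_i = (\delta_{i1}, \ldots, \delta_{iL})^T$, $i = 1, \ldots, L$, be the standard basis in
$\mathbb{R}^L$. In what follows, we need the following result.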

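Theorem 6.9.5 Let $l \in [1, L]$ be an integer and $a > 0$. Denote by $[0, a]_l \times
\mathbb{R}_+^{L-1}$ the set of all $\mathbf{p} = (p_1, \ldots, p_L)^T \in \mathbb{R}_+^L$ satisfying $p_l \le a$. Then the
image of the set $[0, a]_l \times \mathbb{R}_+^{L-1}$ by the map $\boldsymbol{\gamma}$ (6.9.1) is given by

(6.9.16) $\rho(\operatorname{diag}(\boldsymbol{\gamma})(F + \frac{1}{a}\mathbf{v}\mathbf{e}_l^T)) \le 1$, $\mathbf{0} \le \boldsymbol{\gamma}$.

Furthermore, $\mathbf{p} = (p_1, \ldots, p_L)^T \in \mathbb{R}_+^L$ satisfies the condition $p_l = a$ if and
only if $\boldsymbol{\gamma} = \boldsymbol{\gamma}(\mathbf{p})$ satisfies

(6.9.17) $\rho(\operatorname{diag}(\boldsymbol{\gamma})(F + \frac{1}{a}\mathbf{v}\mathbf{e}_l^T)) = 1$.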

Proof. Suppose that 7 satisfies (6.9.16). We claim that 7 e T. Suppose 
first that 7 > 0. Then diag(7)(P + iive^) < diag(7)(P + t 2 vej) for any 
ii < t2- Lemma 6.2.4 yields 

(6p$dlgg( 7 )P) < p(diag(7)(P + tiveT)) < p(diag( 7 )(P + t.ve^)) < 
p(diag(7)(P+ -ve^)) < 1 for < t ± < t 2 < -. 
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Thus 7 e r. Combine the above argument with the arguments of the proof 
of Claim 6.9.3 to deduce that 7 e T for 7 > 0. 

We now show that -P(7)z < a. The continuity of P implies that it is 
enough to consider the case 7 > 0. Combine the Perron- Frobenius theorem 
with (6.9.18) to deduce 

(6.9.19) < dct (I -diag{-y){F + tvej)) for t e M" 1 ). 

We now expand the right-hand side of the above inequality. Let $B = xy^T \in \mathbb{R}^{L \times L}$ be a rank one matrix. Then $B$ has $L - 1$ zero eigenvalues and one eigenvalue equal to $y^T x$. Hence, $I - xy^T$ has $L - 1$ eigenvalues equal to 1 and one eigenvalue equal to $1 - y^T x$. Therefore, $\det(I - xy^T) = 1 - y^T x$. Since $\gamma \in \Gamma$ we get that $I - \operatorname{diag}(\gamma)F$ is invertible. Thus, for any $t \in \mathbb{R}$,

(6.9.20) $\det(I - \operatorname{diag}(\gamma)(F + t v e_l^T))$
$= \det(I - \operatorname{diag}(\gamma)F)\,\det(I - t((I - \operatorname{diag}(\gamma)F)^{-1}\operatorname{diag}(\gamma)v)e_l^T)$
$= \det(I - \operatorname{diag}(\gamma)F)\,(1 - t e_l^T (I - \operatorname{diag}(\gamma)F)^{-1}\operatorname{diag}(\gamma)v).$

Combine (6.9.19) with the above identity to deduce

(6.9.21) $1 > t e_l^T (I - \operatorname{diag}(\gamma)F)^{-1}\operatorname{diag}(\gamma)v = t P(\gamma)_l$ for $t \in [0, a^{-1})$.

Letting $t \nearrow a^{-1}$, we deduce that $P(\gamma)_l \le a$. Hence, the set of $\gamma$ defined by (6.9.16) is a subset of $\gamma([0, a]_l \times \mathbb{R}_+^{L-1})$.
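
The rank one identity $\det(I - xy^T) = 1 - y^T x$ behind (6.9.20) is easy to check numerically. The following is a minimal sketch in plain Python; the helper names `det` and `identity_minus_rank_one` and the sample vectors are illustrative assumptions, not part of the text.

```python
def det(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    A = [row[:] for row in M]
    n = len(A)
    d = 1.0
    for c in range(n):
        # Pick the row with the largest pivot in column c.
        piv = max(range(c, n), key=lambda r: abs(A[r][c]))
        if A[piv][c] == 0.0:
            return 0.0
        if piv != c:
            A[c], A[piv] = A[piv], A[c]
            d = -d
        d *= A[c][c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= f * A[c][k]
    return d


def identity_minus_rank_one(x, y):
    """Return the matrix I - x y^T as a list of rows."""
    n = len(x)
    return [[(1.0 if i == j else 0.0) - x[i] * y[j] for j in range(n)]
            for i in range(n)]


# Hypothetical sample vectors.
x = [0.5, -1.0, 2.0]
y = [1.0, 0.25, 0.1]
lhs = det(identity_minus_rank_one(x, y))
rhs = 1.0 - sum(a * b for a, b in zip(x, y))
print(lhs, rhs)
```

The two printed numbers should agree up to rounding error.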

Let $p \in [0, a]_l \times \mathbb{R}_+^{L-1}$ and denote $\gamma = \gamma(p)$. We show that $\gamma$ satisfies (6.9.16). Claim 6.9.3 implies that $\rho(\operatorname{diag}(\gamma)F) < 1$. Since $p = P(\gamma)$ and $p_l \le a$ we deduce (6.9.21). Use (6.9.20) to deduce (6.9.19). As $\rho(\operatorname{diag}(\gamma)F) < 1$, the inequality (6.9.19) implies that $\rho(\operatorname{diag}(\gamma)(F + t v e_l^T)) < 1$ for $t \in (0, a^{-1})$. Hence, (6.9.16) holds.

It is left to show that the condition (6.9.17) holds if and only if $P(\gamma)_l = a$. Assume that $p = (p_1, \ldots, p_L)^T \in \mathbb{R}_+^L$, $p_l = a$ and let $\gamma = \gamma(p)$. We claim that equality holds in (6.9.16). Assume to the contrary that $\rho(\operatorname{diag}(\gamma)(F + \frac{1}{a} v e_l^T)) < 1$. Then there exists $\beta > \gamma$ such that $\rho(\operatorname{diag}(\beta)(F + \frac{1}{a} v e_l^T)) < 1$. Since $P$ is monotonic, $P(\beta)_l > p_l = a$. On the other hand, since $\beta$ satisfies (6.9.16), we deduce that $P(\beta)_l \le a$. This contradiction yields (6.9.17). Similarly, if $\gamma \ge 0$ and (6.9.17) holds, then $P(\gamma)_l = a$. □



Corollary 6.9.6 Let $\bar{p} = (\bar{p}_1, \ldots, \bar{p}_L)^T > 0$ be a given positive vector. Then $\gamma([0, \bar{p}])$, the image of the set $[0, \bar{p}]$ by the map $\gamma$ (6.9.1), is given by

(6.9.22) $\rho(\operatorname{diag}(\gamma)(F + \frac{1}{\bar{p}_l} v e_l^T)) \le 1$, for $l = 1, \ldots, L$, and $\gamma \in \mathbb{R}_+^L$.

In particular, any $\gamma \in \mathbb{R}_+^L$ satisfying the conditions (6.9.22) satisfies the inequalities

(6.9.23) $\gamma \le \bar{\gamma} = (\bar{\gamma}_1, \ldots, \bar{\gamma}_L)^T$, where $\bar{\gamma}_i = \frac{\bar{p}_i}{v_i}$, $i = 1, \ldots, L$.

Proof. Theorem 6.9.5 yields that $\gamma([0, \bar{p}])$ is given by (6.9.22). (6.9.4) yields

$\gamma_i(p) = \frac{p_i}{(Fp)_i + v_i} \le \frac{p_i}{v_i} \le \frac{\bar{p}_i}{v_i}$ for $p \in [0, \bar{p}]$.

Note that equality holds for $p = \bar{p}_i e_i$. □
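
The maps $\gamma(p)$ and $P(\gamma)$ can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it uses $\gamma_i(p) = p_i/((Fp)_i + v_i)$ as in the proof above and $P(\gamma) = (I - \operatorname{diag}(\gamma)F)^{-1}\operatorname{diag}(\gamma)v$ as in (6.9.21), and it evaluates $P(\gamma)$ by the fixed-point iteration $p \leftarrow \operatorname{diag}(\gamma)(Fp + v)$, whose iterates are the partial sums of the series for the inverse when $\rho(\operatorname{diag}(\gamma)F) < 1$. The function names and the data `F`, `v`, `p` are hypothetical.

```python
def gamma_of_p(p, F, v):
    """gamma_i(p) = p_i / ((F p)_i + v_i)."""
    L = len(p)
    return [p[i] / (sum(F[i][j] * p[j] for j in range(L)) + v[i])
            for i in range(L)]


def p_of_gamma(g, F, v, iters=500):
    """Approximate P(gamma) by iterating p <- diag(gamma)(F p + v) from p = 0.

    The iteration converges when the spectral radius of diag(gamma) F is < 1.
    """
    L = len(g)
    p = [0.0] * L
    for _ in range(iters):
        p = [g[i] * (sum(F[i][j] * p[j] for j in range(L)) + v[i])
             for i in range(L)]
    return p


# Hypothetical data: nonnegative F with zero diagonal, positive v.
F = [[0.0, 0.2, 0.1],
     [0.3, 0.0, 0.2],
     [0.1, 0.1, 0.0]]
v = [0.1, 0.2, 0.1]
p = [1.0, 0.5, 2.0]

g = gamma_of_p(p, F, v)
p_back = p_of_gamma(g, F, v)          # should recover p
upper = [pi / vi for pi, vi in zip(p, v)]  # the bound (6.9.23) with p-bar = p
print(g, p_back, upper)
```

In this example the recovered vector agrees with `p`, and each $\gamma_i$ stays below $p_i / v_i$, as in (6.9.23).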



6.9.5 Reformulation of optimal problems 

Theorem 6.9.7 The maximum problem (6.9.7) is equivalent to the maximum problem

(6.9.24) maximize $\sum_l w_l \log(1 + \gamma_l)$
subject to $\rho(\operatorname{diag}(\gamma)(F + (1/\bar{p}_l) v e_l^T)) \le 1 \quad \forall\, l \in \langle L \rangle$,
variables: $\gamma_l$, $\forall\, l$.

$\gamma^*$ is a maximal solution of the above problem if and only if $P(\gamma^*)$ is a maximal solution $p^*$ of the problem (6.9.7). In particular, any maximal solution $\gamma^*$ satisfies the equality in (6.9.22) for some integer $l \in [1, L]$.

We now give the following simple necessary conditions for a maximal 
solution p* of (6.9.7). We first need the following result, which is obtained 
by straightforward differentiation. 

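Lemma 6.9.8 Denote by

$\nabla \Phi_w(\gamma) = \left(\frac{w_1}{1 + \gamma_1}, \ldots, \frac{w_L}{1 + \gamma_L}\right)^T = w \circ (\mathbf{1} + \gamma)^{-1}$

the gradient of $\Phi_w$. Let $\gamma(p)$ be defined as in (6.9.1). Then $H(p) = [\frac{\partial \gamma_i}{\partial p_j}]_{i,j=1}^L$, the Jacobian matrix of $\gamma(p)$ (the matrix of first partial derivatives), is given by

$H(p) = \operatorname{diag}((Fp + v)^{-1})(-\operatorname{diag}(\gamma(p))F + I).$

In particular,

$\nabla_p \Phi_w(\gamma(p)) = H(p)^T \nabla \Phi_w(\gamma(p)).$

Corollary 6.9.9 Let $p^* = (p_1^*, \ldots, p_L^*)^T$ be a maximal solution to the problem (6.9.7). Divide the set $\langle L \rangle = \{1, \ldots, L\}$ into the following three disjoint sets $S_{\max}$, $S_{\mathrm{in}}$, $S_0$:

$S_{\max} = \{i \in \langle L \rangle,\ p_i^* = \bar{p}_i\}, \quad S_{\mathrm{in}} = \{i \in \langle L \rangle,\ p_i^* \in (0, \bar{p}_i)\}, \quad S_0 = \{i \in \langle L \rangle,\ p_i^* = 0\}.$

Then the following conditions hold.

(6.9.25)
$(H(p^*)^T \nabla \Phi_w(\gamma(p^*)))_i \ge 0$ for $i \in S_{\max}$,
$(H(p^*)^T \nabla \Phi_w(\gamma(p^*)))_i = 0$ for $i \in S_{\mathrm{in}}$,
$(H(p^*)^T \nabla \Phi_w(\gamma(p^*)))_i \le 0$ for $i \in S_0$.

Proof. Assume that $p_i^* = \bar{p}_i$. Then $\frac{\partial}{\partial p_i}\Phi_w(\gamma(p))(p^*) \ge 0$. Assume that $0 < p_i^* < \bar{p}_i$. Then $\frac{\partial}{\partial p_i}\Phi_w(\gamma(p))(p^*) = 0$. Assume that $p_i^* = 0$. Then $\frac{\partial}{\partial p_i}\Phi_w(\gamma(p))(p^*) \le 0$. □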

We now show that the maximum problem (6.9.24) can be restated as the maximum problem of a convex function on a closed unbounded domain. For $\gamma = (\gamma_1, \ldots, \gamma_L)^T > 0$ let $\tilde{\gamma} = \log \gamma$, i.e. $\gamma = e^{\tilde{\gamma}}$. Recall that for a nonnegative irreducible matrix $B \in \mathbb{R}_+^{L \times L}$, $\log \rho(\operatorname{diag}(e^{x})B)$ is a convex function of $x$, Theorem 6.7.4. Furthermore, $\log(1 + e^t)$ is a strictly convex function in $t \in \mathbb{R}$. Hence, the maximum problem (6.9.24) is equivalent to the problem

(6.9.26) maximize $\sum_l w_l \log(1 + e^{\tilde{\gamma}_l})$
subject to $\log \rho(\operatorname{diag}(e^{\tilde{\gamma}})(F + (1/\bar{p}_l) v e_l^T)) \le 0 \quad \forall\, l \in \langle L \rangle$,
variables: $\tilde{\gamma} = (\tilde{\gamma}_1, \ldots, \tilde{\gamma}_L)^T \in \mathbb{R}^L$.

The unboundedness of the convex set in (6.9.26) is due to the identity $0 = e^{-\infty}$.

Theorem 6.9.10 Let $w > 0$ be a probability vector. Consider the maximum problem (6.9.7). Then any point $0 \le p^* \le \bar{p}$ satisfying the conditions (6.9.25) is a local maximum.

Proof. Since $w > 0$, $\Phi_w(e^{\tilde{\gamma}})$ is a strictly convex function in $\tilde{\gamma} \in \mathbb{R}^L$. Hence, the maximum of (6.9.26) is achieved exactly on the extreme points of the closed unbounded set specified in (6.9.26). (It may happen that some coordinates of the extreme point are $-\infty$.) Translating this observation to the maximal problem (6.9.7), we deduce the theorem. □



We now give simple lower and upper bounds on the value of (6.9.7).

Lemma 6.9.11 Consider the maximal problem (6.9.7). Let $B_l = F + (1/\bar{p}_l) v e_l^T$ for $l = 1, \ldots, L$. Denote $R = \max_{l \in \langle L \rangle} \rho(B_l)$. Let $\bar{\gamma}$ be defined by (6.9.23). Then

$\Phi_w((1/R)\mathbf{1}) \le \max_{p \in [0, \bar{p}]} \Phi_w(\gamma(p)) \le \Phi_w(\bar{\gamma}).$

Proof. By Corollary 6.9.6, $\gamma(p) \le \bar{\gamma}$ for $p \in [0, \bar{p}]$. Hence, the upper bound holds. Clearly, for $\gamma = (1/R)\mathbf{1}$, we have that $\rho(\operatorname{diag}(\gamma)B_l) \le 1$ for $l \in \langle L \rangle$. Then, from Theorem 6.9.7, $\Phi_w((1/R)\mathbf{1})$ yields the lower bound. Equality is achieved in the lower bound when $p^* = t\,x(B_i)$, where $i = \arg\max_{l \in \langle L \rangle} \rho(B_l)$, for some $t > 0$. □

We now show that the substitution $0 < p = e^{q}$, i.e. $p_l = e^{q_l}$, $l = 1, \ldots, L$, can be used to find an efficient algorithm to solve the optimal problem (6.9.6). As in §6.9.3 we can consider the inverse of the maxmin problem of (6.9.6). It is equivalent to the problem

(6.9.27) $\min_{q \le \bar{q}} g(q), \quad g(q) = \max_{i \in \langle L \rangle} \Big( s_i e^{-q_i} + \sum_{j=1}^L f_{ij} e^{q_j - q_i} \Big), \quad \bar{q} = (\log \bar{p}_1, \ldots, \log \bar{p}_L)^T.$

Note that $s_i e^{-q_i} + \sum_{j=1}^L f_{ij} e^{q_j - q_i}$ is a convex function. Fact 6.7.2.5a implies that $g(q)$ is a convex function. Good software and a well-developed mathematical theory are available for quickly finding the minimum of a convex function on a convex set such as $q \le \bar{q}$; see, e.g., [NoW99].

6.9.6 Algorithms for sum rate maximization 

In this section, we outline three algorithms for finding and estimating the maximal sum rates. As above we assume that $w > 0$. Theorem 6.9.10 gives rise to the following algorithm, which is the gradient algorithm in the variable $p$ in the compact polyhedron $[0, \bar{p}]$.

Algorithm 6.9.12 

1. Choose $p_0 \in [0, \bar{p}]$:

(a) Either at random;

(b) or $p_0 = \bar{p}$.

2. Given $p_k = (p_{1,k}, \ldots, p_{L,k})^T \in [0, \bar{p}]$ for $k \ge 0$, compute $a = (a_1, \ldots, a_L)^T = \nabla_p \Phi_w(\gamma(p_k))$. If $a$ satisfies the conditions (6.9.25) for $p^* = p_k$, then $p_k$ is the output. Otherwise let $b = (b_1, \ldots, b_L)^T$ be defined as follows.

(a) $b_i = 0$ if $p_{i,k} = 0$ and $a_i < 0$;

(b) $b_i = 0$ if $p_{i,k} = \bar{p}_i$ and $a_i > 0$;

(c) $b_i = a_i$ if $0 < p_{i,k} < \bar{p}_i$.

Then $p_{k+1} = p_k + t_k b$, where $t_k > 0$ satisfies the conditions $p_{k+1} \in [0, \bar{p}]$ and $\Phi_w(\gamma(p_k + t b))$ increases on the interval $[0, t_k]$.
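
The steps of Algorithm 6.9.12 can be sketched as follows. This is a hedged illustration, not the authors' implementation. It assumes $\gamma_i(p) = p_i/((Fp)_i + v_i)$ and $\Phi_w(\gamma) = \sum_i w_i \log(1 + \gamma_i)$, uses the gradient formula of Lemma 6.9.8, sets $b_i = a_i$ in every case not covered by (a) and (b), and chooses $t_k$ by a simple step-halving rule, which is only one of many possible choices. All function names and the two-user data are hypothetical.

```python
import math


def gamma_of_p(p, F, v):
    L = len(p)
    return [p[i] / (sum(F[i][j] * p[j] for j in range(L)) + v[i])
            for i in range(L)]


def phi(p, F, v, w):
    """Objective Phi_w(gamma(p)) = sum_i w_i log(1 + gamma_i(p))."""
    return sum(wi * math.log(1.0 + gi)
               for wi, gi in zip(w, gamma_of_p(p, F, v)))


def grad_p(p, F, v, w):
    """Gradient H(p)^T grad Phi_w(gamma(p)) from Lemma 6.9.8."""
    L = len(p)
    g = gamma_of_p(p, F, v)
    denom = [sum(F[i][j] * p[j] for j in range(L)) + v[i] for i in range(L)]
    u = [w[i] / ((1.0 + g[i]) * denom[i]) for i in range(L)]
    return [u[j] - sum(F[i][j] * g[i] * u[i] for i in range(L))
            for j in range(L)]


def direction(p, a, pbar):
    """Zero out components blocked by the box constraints (steps (a)-(c))."""
    b = []
    for pi, ai, ui in zip(p, a, pbar):
        if (pi <= 0.0 and ai < 0.0) or (pi >= ui and ai > 0.0):
            b.append(0.0)
        else:
            b.append(ai)
    return b


def gradient_ascent(p, F, v, w, pbar, max_iter=200, tol=1e-10):
    for _ in range(max_iter):
        b = direction(p, grad_p(p, F, v, w), pbar)
        if max(abs(x) for x in b) <= tol:
            break  # conditions (6.9.25) hold up to tol
        # Largest step that keeps p + t*b inside the box [0, pbar].
        t = 1.0
        for pi, bi, ui in zip(p, b, pbar):
            if bi > 0.0:
                t = min(t, (ui - pi) / bi)
            elif bi < 0.0:
                t = min(t, pi / (-bi))
        base = phi(p, F, v, w)
        while t > 1e-14:
            q = [min(max(pi + t * bi, 0.0), ui)
                 for pi, bi, ui in zip(p, b, pbar)]
            if phi(q, F, v, w) > base:
                break
            t *= 0.5  # assumption: halve the step until the objective improves
        else:
            break  # no improving step found
        p = q
    return p


# Hypothetical two-user example with weak interference.
F = [[0.0, 0.1], [0.1, 0.0]]
v = [0.1, 0.1]
w = [0.5, 0.5]
pbar = [1.0, 1.0]
p_start = [0.5, 0.5]
p_out = gradient_ascent(p_start, F, v, w, pbar)
print(p_out, phi(p_out, F, v, w))
```

In this weak-interference example the iteration moves to the corner $\bar{p}$, where the gradient is positive in both coordinates, so the conditions (6.9.25) hold with $S_{\max} = \langle L \rangle$.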

The problem with the gradient method, and with its variations such as the conjugate gradient method, is that it is hard to choose the optimal value of $t_k$ in each step, e.g. [Avr03]. We now use the reformulation of the maximal problem given by (6.9.26). Since $w > 0$, the function $\Phi_w(e^{\tilde{\gamma}})$ is strictly convex. Thus, the maximum is achieved only on the boundary of the convex set

(6.9.28) $D(\{F\}) = \{\tilde{\gamma} \in \mathbb{R}^L,\ \log \rho(\operatorname{diag}(e^{\tilde{\gamma}})(F + (1/\bar{p}_l) v e_l^T)) \le 0,\ \forall\, l\}.$

If one wants to use numerical methods and software for finding the maximum value of convex functions on bounded closed convex sets, e.g., [NoW99], then one needs to consider the maximization problem (6.9.26) with additional constraints:

(6.9.29) $D(\{F\}, K) = \{\tilde{\gamma} \in D(\{F\}),\ \tilde{\gamma} \ge -K\mathbf{1}\}$

for a suitable $K \gg 1$. Note that the above closed set is compact and convex. The following lemma gives the description of the set $D(\{F\}, K)$.

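Lemma 6.9.13 Let $\bar{p} > 0$ be given and let $R$ be defined as in Lemma 6.9.11. Assume that $K > \log R$. Let $\underline{p} = P(e^{-K}\mathbf{1}) = (e^{K}I - F)^{-1}v$. Then $D(\{F\}, K) \subseteq \log \gamma([\underline{p}, \bar{p}])$.

Proof. From the definition of $K$, we have that $e^{K} > R$. Hence, $\rho(e^{-K}B_l) < 1$ for $l = 1, \ldots, L$. Thus $-K\mathbf{1} \in D(\{F\})$. Let $\underline{\gamma} = e^{-K}\mathbf{1}$. Assume that $\tilde{\gamma} \in D(\{F\}, K)$. Then $\tilde{\gamma} \ge -K\mathbf{1}$. Hence, $\gamma = e^{\tilde{\gamma}} \ge \underline{\gamma}$. Since $\rho(\operatorname{diag}(\gamma)F) < 1$, Claim 6.9.4 yields that $p = P(\gamma) \ge P(\underline{\gamma}) = \underline{p}$, where $P$ is defined by (6.9.11). The inequality $P(\gamma) \le \bar{p}$ follows from Corollary 6.9.6. □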

Thus, we can apply numerical methods to find the maximum of the strictly convex function $\Phi_w(e^{\tilde{\gamma}})$ on the closed bounded set $D(\{F\}, K)$, e.g. [NoW99]. In particular, we can use the gradient method. It takes the given boundary point $\tilde{\gamma}_k$ to another boundary point $\tilde{\gamma}_{k+1} \in D(\{F\}, K)$, in the direction induced by the gradient of $\Phi_w(e^{\tilde{\gamma}})$. However, the complicated boundary of $D(\{F\}, K)$ will make any algorithm expensive.

Furthermore, even though the constraint set in (6.9.24) can be transformed into a strictly convex set, it is in general difficult to determine precisely the spectral radius of a given matrix [Var63]. To make the problem simpler and to enable fast algorithms, we approximate the convex set $D(\{F\}, K)$ by a bigger polyhedral convex set as follows. Choose a finite number of points $\zeta_1, \ldots, \zeta_M$ on the boundary of $D(\{F\})$, which preferably lie in $D(\{F\}, K)$. Let $H_1(\xi), \ldots, H_N(\xi)$, $\xi \in \mathbb{R}^L$, be the $N$ supporting hyperplanes of $D(\{F\})$ at these points. (Note that we can have more than one supporting hyperplane at $\zeta_i$, and at most $L$ supporting hyperplanes.) So each $\xi \in D(\{F\}, K)$ satisfies the inequality $H_j(\xi) \le 0$ for $j = 1, \ldots, N$. Let $\bar{\gamma}$ be defined by (6.9.23). Define

(6.9.30) $D(\zeta_1, \ldots, \zeta_M, K) = \{\xi \in \mathbb{R}^L,\ -K\mathbf{1} \le \xi \le \log \bar{\gamma},\ H_j(\xi) \le 0 \text{ for } j = 1, \ldots, N\}.$



Hence, $D(\zeta_1, \ldots, \zeta_M, K)$ is a polytope which contains $D(\{F\}, K)$. Thus

(6.9.31) $\max_{\tilde{\gamma} \in D(\zeta_1, \ldots, \zeta_M, K)} \Phi_w(e^{\tilde{\gamma}}) \ge$
(6.9.32) $\max_{\tilde{\gamma} \in D(\{F\}, K)} \Phi_w(e^{\tilde{\gamma}}).$

Since $\Phi_w(e^{\tilde{\gamma}})$ is strictly convex, the maximum in (6.9.31) is achieved only at the extreme points of $D(\zeta_1, \ldots, \zeta_M, K)$. The maximal solution can be found using a variant of a simplex algorithm [?]. More precisely, one starts at some extreme point $\xi \in D(\zeta_1, \ldots, \zeta_M, K)$. Replace the strictly convex function $\Phi_w(e^{\tilde{\gamma}})$ by its first order Taylor expansion $\Psi_{\xi}$ at $\xi$. Then we find another extreme point $\eta$ of $D(\zeta_1, \ldots, \zeta_M, K)$ such that $\Psi_{\xi}(\eta) > \Psi_{\xi}(\xi) = \Phi_w(e^{\xi})$. Then we replace $\Phi_w(e^{\tilde{\gamma}})$ by its first order Taylor expansion $\Psi_{\eta}$ at $\eta$ and continue the algorithm. Our second proposed algorithm for finding an optimal $\tilde{\gamma}^*$ that maximizes (6.9.31) is given as follows.

Algorithm 6.9.14

1. Choose an arbitrary extreme point $\xi_0 \in D(\zeta_1, \ldots, \zeta_M, K)$.

2. Let $\Psi_{\xi_k}(\xi) = \Phi_w(e^{\xi_k}) + (w \circ (\mathbf{1} + e^{\xi_k})^{-1} \circ e^{\xi_k})^T(\xi - \xi_k)$. Solve the linear program $\max_{\xi} \Psi_{\xi_k}(\xi)$ subject to $\xi \in D(\zeta_1, \ldots, \zeta_M, K)$ using the simplex algorithm in [?] by finding an extreme point $\xi_{k+1}$ of $D(\zeta_1, \ldots, \zeta_M, K)$ such that $\Psi_{\xi_k}(\xi_{k+1}) > \Psi_{\xi_k}(\xi_k) = \Phi_w(e^{\xi_k})$.

3. Compute $p_k = P(e^{\xi_{k+1}})$. If $p_k \in [0, \bar{p}]$, compute $a = (a_1, \ldots, a_L)^T = \nabla_p \Phi_w(\gamma(p_k))$. If $a$ satisfies the conditions (6.9.25) for $p^* = p_k$, then $p_k$ is the output. Otherwise, go to Step 2 using $\Psi_{\xi_{k+1}}(\xi)$.

As in §6.9.3, it would be useful to consider the following related maximal problem:

(6.9.33) $\max_{\tilde{\gamma} \in D(\zeta_1, \ldots, \zeta_M, K)} w^T \tilde{\gamma}.$

The problem given by (6.9.33) is a standard linear program, which can be solved in polynomial time by the classical ellipsoid algorithm [?]. Our third proposed algorithm for finding an optimal $\tilde{\gamma}^*$ that maximizes (6.9.33) is given as follows. Then $p^* = P(e^{\tilde{\gamma}^*})$.

Algorithm 6.9.15 

1. Solve the linear program $\max_{\tilde{\gamma}} w^T \tilde{\gamma}$ subject to $\tilde{\gamma} \in D(\zeta_1, \ldots, \zeta_M, K)$ using the ellipsoid algorithm in [?].

2. Compute $p = P(e^{\tilde{\gamma}})$. If $p \in [0, \bar{p}]$, then $p$ is the output. Otherwise, project $p$ onto $[0, \bar{p}]$.

We note that $\tilde{\gamma} \in D(\zeta_1, \ldots, \zeta_M, K)$ in Algorithm 6.9.15 can be replaced
by the set of supporting hyperplane D(F, K) = {76 i o(diag(e^)F) < 
1, 7 > — Kl} or, if L > 3 and w satisfies the conditions (6.6.10), 
D(F, K) = {76 p(diag(e^)F) < 1, 7 > -Kl} based on the relaxed 
maximal problems in Section 4. Then Theorem 6.9.2 quantifies the closed-form solution $\tilde{\gamma}$ computed by Algorithm 6.9.15.

We conclude this section by showing how to compute the supporting hyperplanes $H_j$, $j = 1, \ldots, N$, which define $D(\zeta_1, \ldots, \zeta_M, K)$. To do that, we give a characterization of the supporting hyperplanes of $D(\{F\})$ at a boundary point $\zeta \in \partial D(\{F\})$.

Theorem 6.9.16 Let $\bar{p} = (\bar{p}_1, \ldots, \bar{p}_L)^T > 0$ be given. Consider the convex set (6.9.28). Let $\xi \in \partial D(\{F\})$ be a boundary point. Then $\xi = \log \gamma(p)$, where $0 \le p = (p_1, \ldots, p_L)^T \le \bar{p}$. The set $\mathcal{B} := \{l \in \langle L \rangle,\ p_l = \bar{p}_l\}$ is nonempty. For each $B_l = F + (1/\bar{p}_l) v e_l^T$ let $H_l(\xi)$ be the supporting hyperplane of $\log \rho(\operatorname{diag}(e^{x})B_l)$ at $\xi$, defined as in Theorem 6.7.4. Then $H_l \le 0$, for $l \in \mathcal{B}$, are the supporting hyperplanes of $D(\{F\})$ at $\xi$.

Proof. Let $p = P(e^{\xi})$. Theorem 6.9.5 implies the set $\mathcal{B}$ is nonempty. Furthermore, $\rho(\operatorname{diag}(e^{\xi})B_l) = 1$ if and only if $p_l = \bar{p}_l$. Hence, $\xi$ lies exactly at the intersection of the hypersurfaces $\log \rho(\operatorname{diag}(e^{\xi})B_l) = 0$, $l \in \mathcal{B}$. Theorem 6.7.4 implies that the supporting hyperplanes of $D(\{F\})$ at $\xi$ are $H_l(\xi) \le 0$ for $l \in \mathcal{B}$. □

We now show how to choose the boundary points $\zeta_1, \ldots, \zeta_M \in \partial D(\{F\})$ and to compute the supporting hyperplanes of $D(\{F\})$ at each $\zeta_i$. Let $\underline{p} = P(e^{-K}\mathbf{1}) = (\underline{p}_1, \ldots, \underline{p}_L)^T$ be defined as in Lemma 6.9.13. Choose $M_i \ge 2$ equidistant points in each interval $[\underline{p}_i, \bar{p}_i]$:



(6.9.34) $p_{j_i, i} = \frac{j_i \bar{p}_i + (M_i - j_i)\underline{p}_i}{M_i}$ for $j_i = 1, \ldots, M_i$, and $i = 1, \ldots, L$.
Let

$\mathcal{P} = \{p_{j_1, \ldots, j_L} := (p_{j_1, 1}, \ldots, p_{j_L, L})^T,\ \min(\bar{p}_1 - p_{j_1, 1}, \ldots, \bar{p}_L - p_{j_L, L}) = 0\}.$

That is, $p_{j_1, \ldots, j_L} \in \mathcal{P}$ if and only if $p_{j_1, \ldots, j_L} \not< \bar{p}$. Then

$\{\zeta_1, \ldots, \zeta_M\} = \log \gamma(\mathcal{P}).$

The supporting hyperplanes of $D(\{F\})$ at each $\zeta_i$ are given by Theorem 6.9.16.
Chapter 7

Convexity 



7.1 Convex sets 

In this chapter all vector spaces are finite dimensional. 

Definition 7.1.1 Let $V$ be a finite dimensional vector space over $\mathbb{F} = \mathbb{R}, \mathbb{C}$.

1. For $x, y \in V$ denote

$[x, y] := \{z : z = ax + (1 - a)y \text{ for some } a \in [0, 1]\},$
$(x, y) := \{z : z = ax + (1 - a)y \text{ for some } a \in (0, 1)\}.$

$[x, y]$, $(x, y)$ are called closed and open intervals respectively, with the end points $x, y$.

2. For a nonempty $S \subset V$ denote $\operatorname{conv} S = \cup_{x, y \in S}[x, y]$, called the convex hull of $S$. ($\operatorname{conv} \emptyset = \emptyset$.)

3. A set $C \subset V$ is called convex if for each $x, y \in C$, $[x, y] \subset C$. ($\emptyset$ is convex.)

4. Assume that $C \subset V$ is a nonempty convex set and let $x \in C$. Denote by $C - x$ the set $\{z : z = y - x,\ y \in C\}$. Let $U = \operatorname{span}_{\mathbb{R}}(C - x)$, i.e. the set of all linear combinations of elements of $C - x$ with real coefficients. Then $U$ is a finite dimensional real space. The dimension of $C$, denoted by $\dim C$, is the dimension of the vector space $U$. ($\dim \emptyset = -1$.) $C - x$ has interior as a subset of $U$, which is called the relative interior and denoted by $\operatorname{ri}(C - x)$. Then the relative interior of $C$ is defined as $\operatorname{ri} C = \operatorname{ri}(C - x) + x$.

5. A point $x$ in a convex set $C$ is called an extreme point if there are no two points $y, z \in C \setminus \{x\}$ such that $x \in [y, z]$. Denote by $\mathcal{E}(C)$ the set of the extreme points of the convex set $C$.

6. Let $C$ be a convex set and $\mathcal{E}(C)$ the set of its extreme points. For $k$ distinct extreme points $x_1, \ldots, x_k \in \mathcal{E}(C)$ the convex set $\operatorname{conv}\{x_1, \ldots, x_k\}$ is called the $k$-face of $C$ if the following property holds. Let $x, y \in C$ and assume that $(x, y) \cap \operatorname{conv}\{x_1, \ldots, x_k\} \ne \emptyset$. Then $[x, y] \subset \operatorname{conv}\{x_1, \ldots, x_k\}$.

7. Let $C$ be a convex set in a finite dimensional vector space $V$. For $f \in V^*$ and $x \in V$ denote

$H_0(f, x) := \{y \in V,\ \Re f(y) = \Re f(x)\},$
$H_+(f, x) := \{y \in V,\ \Re f(y) \ge \Re f(x)\},$
$H_-(f, x) := \{y \in V,\ \Re f(y) \le \Re f(x)\}.$

$H_0(f, x)$ is called the (real) hyperplane, $H_+(f, x)$, $H_-(f, x)$ are called the upper and the lower half spaces respectively, or simply the half spaces.

It is straightforward to show that $\dim C$, $\operatorname{ri} C$ do not depend on the choice of $x \in C$. Furthermore $\operatorname{ri} C$ is convex. See Problem 3 or [Roc70].

Assume that $V$ is a complex finite dimensional vector space of dimension $n$. Then $V$ can be viewed as a real vector space $V_{\mathbb{R}}$ of dimension $2n$. A convex set $C \subset V$ is a convex set $C_{\mathbb{R}} \subset V_{\mathbb{R}}$. However, as we see later, sometimes it is natural to consider convex sets as subsets of the complex vector space $V$, rather than subsets of $V_{\mathbb{R}}$.

Clearly, $H_0(f, x)$, $H_+(f, x)$, $H_-(f, x)$ are convex sets. Note also that

$H_-(f, x) = H_+(-f, x).$

Definition 7.1.2 An intersection of a finite number of half spaces $\cap_{j=1}^m H_+(f_j, x_j)$ is called a polyhedron. A nonempty compact polyhedron is called a polytope.

Clearly, a polyhedron is a closed convex set. Given a polyhedron $C$, it is a natural problem to decide whether it is empty. The complexity of deciding this depends polynomially on the dimension of $V$ and on the complexity of all the half spaces characterizing $C$. This is a nontrivial fact, which is obtained using the ellipsoid method. See [Kha79, Kar84, Lov86].

It is well known that any polytope has a finite number of extreme points, and is equal to the convex hull of its extreme points [Roc70, p. 12]. The following result is a generalization of this fact [Roc70, Part IV, §17-18].
Theorem 7.1.3 Let $C$ be a compact convex set in a finite dimensional vector space $V$ of dimension $d$. Then $\mathcal{E} := \mathcal{E}(C)$ is a nonempty set and $C = \operatorname{conv} \mathcal{E}$. Furthermore, for each $x \in C$ there exist at most $d + 1$ extreme points $x_1, \ldots, x_k \in \mathcal{E}$, $k \le d + 1$, such that $x \in \operatorname{conv}\{x_1, \ldots, x_k\}$.

In general it is a difficult problem to find explicitly all the extreme points of a given compact convex set, or even of a given polyhedron. The following example is a classic in matrix theory.

Theorem 7.1.4 Let $H_{n,+,1} \subset \mathbb{C}^{n \times n}$ be the convex set of nonnegative definite hermitian matrices with trace 1. Then

(7.1.1) $\mathcal{E}(H_{n,+,1}) = \{xx^*,\ x \in \mathbb{C}^n,\ x^*x = 1\},$

(7.1.2) $\mathcal{E}(H_{n,+,1} \cap S(n, \mathbb{R})) = \{xx^T,\ x \in \mathbb{R}^n,\ x^Tx = 1\}.$

Each matrix in $H_{n,+,1}$ or $H_{n,+,1} \cap S(n, \mathbb{R})$ is a convex combination of at most $n$ extreme points.

Proof. Let $A = xx^*$, $x \in \mathbb{C}^n$, $x^*x = 1$. Clearly $A \in H_{n,+,1}$. Suppose that $A = aB + (1 - a)C$ for some $B, C \in H_{n,+,1}$ and $a \in (0, 1)$. Hence $A \succeq aB \succeq 0$. Since $y^*Ay \ge a y^*By \ge 0$ it follows that $y^*By = 0$ for $y^*x = 0$. Hence $By = 0$ for $y^*x = 0$. Thus $B$ is a rank one nonnegative definite matrix of the form $t xx^*$ where $t > 0$. Since $\operatorname{tr} B = 1$ we deduce that $t = 1$ and $B = A$. Similarly $C = A$. Hence $A$ is an extremal point.

Let $F \in H_{n,+,1}$. Then the spectral decomposition of $F$ yields that $F = \sum_{i=1}^n \lambda_i x_i x_i^*$, where $x_i^* x_j = \delta_{ij}$, $i, j = 1, \ldots, n$. Furthermore, since $F$ is nonnegative definite of trace 1, $\lambda_1, \ldots, \lambda_n$, the eigenvalues of $F$, are nonnegative and sum to 1. So $F \in \operatorname{conv}\{x_1 x_1^*, \ldots, x_n x_n^*\}$. In particular, if $F$ is an extreme point then $F = x_i x_i^*$ for some $i$. Similar arguments apply to the real symmetric case $H_{n,+,1} \cap S(n, \mathbb{R})$. □

Definition 7.1.5 Let $C_1, C_2 \subset V$, where $V$ is a finite dimensional vector space over $\mathbb{F}$. $C_1, C_2$ are called hyperplane separated if there exist $f \in V^*$ and $x \in V$ such that $C_1 \subset H_+(f, x)$, $C_2 \subset H_-(f, x)$. $H_0(f, x)$ is called the separating (real) hyperplane. $H_0(f, x)$ is said to separate $C_1$ and $C_2$ properly if $H_0(f, x)$ separates $C_1$ and $C_2$ and $H_0(f, x)$ does not contain both $C_1$ and $C_2$.

The following result is well known [Roc70, Theorem 11.3].

Theorem 7.1.6 Let $C_1, C_2$ be nonempty convex sets in a finite dimensional vector space $V$. Then there exists a hyperplane separating $C_1$ and $C_2$ properly if and only if $\operatorname{ri} C_1 \cap \operatorname{ri} C_2 = \emptyset$.
Corollary 7.1.7 Let $C_1$ be a compact convex set in a finite dimensional vector space $V$ over $\mathbb{F} = \mathbb{R}, \mathbb{C}$. Assume that $C_1$ contains more than one point. Let $x$ be an extreme point of $C_1$. Then there exists a hyperplane which supports $C_1$ properly at $x$. I.e., there exists $0 \ne f \in V^*$ such that $\Re f(x) \le \Re f(y)$ for each $y \in C_1$. Furthermore, there exists $y \in C_1$ such that $\Re f(x) < \Re f(y)$.

Proof. Let $C_2 = \{x\}$. So $C_2$ is a convex set. Problem 4 yields that $\operatorname{ri} C_1 \cap \operatorname{ri} C_2 = \emptyset$. Use Theorem 7.1.6 to deduce the Corollary. □



Definition 7.1.8 A point $x$ of a convex set $C$ in a finite dimensional vector space $V$ is called exposed if there exists a linear functional $f \in V^*$ such that $\Re f(x) > \Re f(y)$ for any $y \in C \setminus \{x\}$.

Clearly, an exposed point of $C$ is an extreme point (Problem 5). There exist compact convex sets with extreme points which are not exposed. See Problem 6. In what follows we need the theorem of Straszewicz [Str35].

Theorem 7.1.9 Let $C$ be a closed convex set. Then the set of exposed points of $C$ is a dense subset of the extreme points of $C$. Thus every extreme point is the limit of some sequence of exposed points.

Corollary 7.1.10 Let $C$ be a closed convex set. Let $x \in C$ be an isolated extreme point. (I.e. there is a neighborhood of $x$ where $x$ is the only extreme point of $C$.) Then $x$ is an exposed point.



Problems 

1. Show 

(a) For any nonempty subset S of a finite dimensional vector space 
V over F, conv S is a convex set. 

(b) Furthermore, if $S$ is compact, then $\operatorname{conv} S$ is compact and $\mathcal{E}(\operatorname{conv} S) \subset S$.

2. Let $C$ be a convex set in a finite dimensional vector space, with the set of extreme points $\mathcal{E}(C)$. Let $E_1 \subset \mathcal{E}(C)$ and $C_1 = \operatorname{conv} E_1$. Show that $\mathcal{E}(C_1) = E_1$.

3. Let $V$ be a finite dimensional space and $C \subset V$ a nonempty convex set. Let $x \in C$. Show

(a) The subspace $U := \operatorname{span}(C - x)$ does not depend on $x \in C$.

(b) $C - x$ has a nonempty convex interior in $U$.

4. Let $C$ be a convex set in a finite dimensional vector space $V$. Assume that $C$ contains at least two distinct points. Show

(a) $\dim C \ge 1$.

(b) $\operatorname{ri} C \cap \mathcal{E}(C) = \emptyset$.

5. Let $x \in C$ be an exposed point. Show that $x$ is an extreme point of $C$.

6. Consider the convex set $C \subset \mathbb{R}^2$, which is a union of the three convex sets:

$C_1 := \{(x, y)^T,\ |x| \le 1,\ |y| \le 1\}, \quad C_2 = \{(x, y)^T,\ (x - 1)^2 + y^2 \le 1\}, \quad C_3 = \{(x, y)^T,\ (x + 1)^2 + y^2 \le 1\}.$

Show that $C$ has exactly 4 extreme points $(\pm 1, \pm 1)$ which are not exposed points.

7.2 Doubly stochastic matrices 

Definition 7.2.1 $A \in \mathbb{R}_+^{n \times n}$ is called a doubly stochastic matrix if the sum of each row and column of $A$ is equal to 1. Denote by $\Omega_n \subset \mathbb{R}_+^{n \times n}$ the set of doubly stochastic matrices. Denote by $\frac{1}{n}J_n$ the $n \times n$ doubly stochastic matrix all of whose entries are equal to $\frac{1}{n}$, i.e. $J_n \in \mathbb{R}_+^{n \times n}$ is the matrix whose every entry is 1.

Definition 7.2.2 $P \in \mathbb{R}_+^{n \times n}$ is called a permutation matrix if each row and column of $P$ contains exactly one nonzero element, which is equal to 1. Denote by $\mathcal{P}_n$ the set of $n \times n$ permutation matrices.

Lemma 7.2.3 The following properties hold.

1. $A \in \mathbb{R}_+^{n \times n}$ is doubly stochastic if and only if $A\mathbf{1} = A^T\mathbf{1} = \mathbf{1}$, where $\mathbf{1} = (1, \ldots, 1)^T \in \mathbb{R}^n$.

2. $\Omega_1 = \{1\}$.

3. $A, B \in \Omega_n \Rightarrow tA + (1 - t)B \in \Omega_n$ for each $t \in [0, 1]$.






4. $A, B \in \Omega_n \Rightarrow AB \in \Omega_n$.

5. $\mathcal{P}_n$ is a group with respect to the multiplication of matrices, with $I_n$ the identity and $P^{-1} = P^T$.

7. $A \in \Omega_l,\ B \in \Omega_m \Rightarrow A \oplus B \in \Omega_{l+m}$.

See Problem 1.

Theorem 7.2.4 The set $\Omega_n$ is a polytope of dimension $(n - 1)^2$, whose set of extreme points is the set of permutation matrices $\mathcal{P}_n$.

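Proof. Clearly, $\Omega_n$ is a nonempty compact convex set in $\mathbb{R}^{n \times n}$. $\Omega_n$ is a polytope since it is the intersection of $4n + n^2$ half spaces

(7.2.1) $\sum_{k=1}^n x_{kj} \ge 1, \quad \sum_{k=1}^n -x_{kj} \ge -1, \quad \sum_{k=1}^n x_{jk} \ge 1, \quad \sum_{k=1}^n -x_{jk} \ge -1, \quad x_{ij} \ge 0, \quad i = 1, \ldots, n,\ j = 1, \ldots, n,$

where $X = [x_{ij}]_{i,j=1}^n$.

Let $\Omega_{n,0} = \Omega_n - \{\frac{1}{n}J_n\}$, i.e. $\Omega_{n,0}$ is the set of all matrices of the form $A - \frac{1}{n}J_n$, where $A \in \Omega_n$. Denote

(7.2.2) $\mathcal{X}_n = \{X \in \mathbb{R}^{n \times n},\ X\mathbf{1} = X^T\mathbf{1} = 0\}.$

Clearly $\mathcal{X}_n$ is a subspace of $\mathbb{R}^{n \times n}$, which contains $\Omega_{n,0}$. Let $Q \in \mathbb{R}^{n \times n}$ be an orthogonal matrix whose first column is the vector $\frac{1}{\sqrt{n}}\mathbf{1}$. We claim that $X \in \mathcal{X}_n$ if and only if $Q^T X Q = [0] \oplus Y$ for some $Y \in \mathbb{R}^{(n-1) \times (n-1)}$. Indeed, observe that $Z = [0] \oplus Y$ if and only if $Ze_1 = Z^Te_1 = 0$, where $e_1 = (1, 0, \ldots, 0)^T \in \mathbb{R}^n$. Clearly, $Qe_1 = \frac{1}{\sqrt{n}}\mathbf{1}$, hence $Q^T X Q = [0] \oplus Y$ if and only if $X\mathbf{1} = X^T\mathbf{1} = 0$. So $\mathcal{X}_n = Q([0] \oplus \mathbb{R}^{(n-1) \times (n-1)})Q^T$, hence $\dim \mathcal{X}_n = \dim \mathbb{R}^{(n-1) \times (n-1)} = (n - 1)^2$.

Let

(7.2.3) $B(0, r) = \{X \in \mathbb{R}^{n \times n},\ \operatorname{tr} X^TX \le r^2\}$

be the closed ball of radius $r$ in the Frobenius, i.e. Euclidean, norm in $\mathbb{R}^{n \times n}$ centered at the origin. We claim that $B(0, \frac{1}{n}) \cap \mathcal{X}_n \subset \Omega_{n,0}$. Indeed, let $X = [x_{ij}]_{i,j=1}^n \in B(0, \frac{1}{n}) \cap \mathcal{X}_n$. Then $|x_{ij}| \le \frac{1}{n}$ and $X\mathbf{1} = X^T\mathbf{1} = 0$. Let $A = X + \frac{1}{n}J_n$. So $A \ge 0$ and $A\mathbf{1} = A^T\mathbf{1} = \mathbf{1}$ and $A \in \Omega_n$. Hence $\dim \Omega_n = (n - 1)^2$.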
We next observe that any permutation matrix $P = [p_{ij}] \in \mathcal{P}_n$ is an extreme point of $\Omega_n$. Indeed, assume that $P = tA + (1 - t)B$ for some $A = [a_{ij}]$, $B = [b_{ij}] \in \Omega_n$ and $t \in (0, 1)$. Hence, $a_{ij} = b_{ij} = 0$ if $p_{ij} = 0$. So $A$ and $B$ have at most one nonzero element in each row $i$. As $A, B \in \Omega_n$ it follows that $a_{ij} = b_{ij} = p_{ij}$ if $p_{ij} = 1$. Thus $A = B = P$ and $P$ is an extremal point.

It is left to show that $\mathcal{E}(\Omega_n) = \mathcal{P}_n$. This is equivalent to the statement that $A \in \mathbb{R}_+^{n \times n}$ is doubly stochastic if and only if

(7.2.4) $A = \sum_{P \in \mathcal{P}_n} a_P P$ for some $a_P \ge 0$, $P \in \mathcal{P}_n$, $\sum_{P \in \mathcal{P}_n} a_P = 1$.

We now show by induction on $n$ that any $A \in \Omega_n$ is of the form (7.2.4). For $n = 1$ the result trivially holds. Assume that the result holds for $n = m - 1$ and assume that $n = m$. Let $A = [a_{ij}] \in \Omega_n$. Denote by $l(A)$ the number of nonzero entries of $A$. Since each row sum of $A$ is 1 it follows that $l(A) \ge n$. Suppose first that $l(A) \le 2n - 1$. Then there exists a row $i$ of $A$ which has exactly one nonzero element, which must be 1. Hence there exist $i, j \in \langle n \rangle$ such that $a_{ij} = 1$. Then all other elements of $A$ on the row $i$ and column $j$ are zero. Denote by $A_{ij} \in \mathbb{R}_+^{(n-1) \times (n-1)}$ the matrix obtained from $A$ by deleting the row $i$ and column $j$. Clearly $A_{ij} \in \Omega_{n-1}$. Use the induction hypothesis on $A_{ij}$ to deduce (7.2.4), where $a_P = 0$ if the entry $(i, j)$ of $P$ is not 1.

We now show by induction on $l(A) \ge 2n - 1$ that $A$ is of the form (7.2.4). Suppose that any $A \in \Omega_n$ such that $l(A) \le l - 1$, $l \ge 2n$, is of the form (7.2.4). Assume that $l(A) = l$. Let $S \subset \langle n \rangle \times \langle n \rangle$ be the set of all indices $(i, j) \in \langle n \rangle \times \langle n \rangle$ where $a_{ij} > 0$. Note $\#S = l(A) \ge 2n$.

Consider the following system of equations in $n^2$ variables, which are the entries of $X = [x_{ij}]_{i,j=1}^n \in \mathbb{R}^{n \times n}$:

$\sum_{j=1}^n x_{ij} = \sum_{j=1}^n x_{ji} = 0, \quad i = 1, \ldots, n.$

Since the sum of all rows of $X$ is equal to the sum of all columns of $X$ we deduce that the above system has at most $2n - 1$ linearly independent equations. Assume furthermore the conditions $x_{ij} = 0$ for $(i, j) \notin S$. Since we have at least $2n$ variables it follows that there exists $X \ne 0_{n \times n}$ satisfying the above conditions. Note that $X$ has a zero entry in the places where $A$ has a zero entry. Furthermore, $X$ has at least one positive and one negative entry. Therefore there exist $b, c > 0$ such that $A - bX, A + cX \in \Omega_n$ and $l(A - bX), l(A + cX) < l$. So $A - bX$, $A + cX$ are of the form (7.2.4). As $A = \frac{c}{b + c}(A - bX) + \frac{b}{b + c}(A + cX)$ we deduce that $A$ is of the form (7.2.4). □
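
The decomposition (7.2.4) can also be computed explicitly. The sketch below uses a standard greedy procedure rather than the inductive argument of the proof: it repeatedly finds a permutation supported on the positive entries (by augmenting paths), subtracts the largest possible multiple of it, and continues until the matrix is zero. Exact rational arithmetic avoids rounding issues. The function names `perfect_matching` and `birkhoff`, and the sample matrix, are illustrative assumptions.

```python
from fractions import Fraction


def perfect_matching(A):
    """Return sigma with A[i][sigma[i]] > 0 for every row i (augmenting paths)."""
    n = len(A)
    col_owner = [None] * n  # col_owner[j] = row currently matched to column j

    def try_row(i, seen):
        for j in range(n):
            if A[i][j] > 0 and j not in seen:
                seen.add(j)
                if col_owner[j] is None or try_row(col_owner[j], seen):
                    col_owner[j] = i
                    return True
        return False

    for i in range(n):
        if not try_row(i, set()):
            raise ValueError("no permutation inside the support of the matrix")
    sigma = [None] * n
    for j, i in enumerate(col_owner):
        sigma[i] = j
    return sigma


def birkhoff(A):
    """Write a doubly stochastic A as a list of (weight, permutation) pairs."""
    A = [[Fraction(x) for x in row] for row in A]
    n = len(A)
    terms = []
    while any(x > 0 for row in A for x in row):
        sigma = perfect_matching(A)
        t = min(A[i][sigma[i]] for i in range(n))
        for i in range(n):
            A[i][sigma[i]] -= t  # at least one more entry becomes zero
        terms.append((t, tuple(sigma)))
    return terms


# Hypothetical doubly stochastic matrix with exact rational entries.
A = [[Fraction(1, 2), Fraction(1, 2), 0],
     [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)],
     [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)]]
terms = birkhoff(A)
for weight, sigma in terms:
    print(weight, sigma)
```

Each step zeroes at least one positive entry, so the loop stops after at most as many steps as there are nonzero entries. The remaining matrix always has equal row and column sums, which is why a permutation inside its support exists.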
Definition 7.2.5 Let

$\mathbb{R}^n_{\searrow} := \{x = (x_1, \ldots, x_n)^T \in \mathbb{R}^n : x_1 \ge x_2 \ge \ldots \ge x_n\}.$

For $x = (x_1, \ldots, x_n)^T \in \mathbb{R}^n$ let $\bar{x} = (\bar{x}_1, \ldots, \bar{x}_n)^T \in \mathbb{R}^n_{\searrow}$ be the unique rearrangement of the coordinates of $x$ in a decreasing order. That is, there exists a permutation $\pi$ on $\{1, \ldots, n\}$ such that $\bar{x}_i = x_{\pi(i)}$, $i = 1, \ldots, n$.

Let $x = (x_1, \ldots, x_n)^T$, $y = (y_1, \ldots, y_n)^T \in \mathbb{R}^n$. Then $x$ is weakly majorized by $y$ ($y$ weakly majorizes $x$), which is denoted by $x \preceq y$, if

(7.2.5) $\sum_{i=1}^k \bar{x}_i \le \sum_{i=1}^k \bar{y}_i, \quad k = 1, \ldots, n.$

$x$ is majorized by $y$ ($y$ majorizes $x$), which is denoted by $x \prec y$, if $x \preceq y$ and $\sum_{i=1}^n x_i = \sum_{i=1}^n y_i$.
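
The definition translates directly into a test: sort both vectors in decreasing order and compare partial sums. A minimal sketch follows; the function names and the tolerance parameter `tol` are assumptions made for floating point input.

```python
def weakly_majorized(x, y, tol=1e-12):
    """True if x is weakly majorized by y, i.e. (7.2.5) holds for k = 1..n."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    xs = sorted(x, reverse=True)
    ys = sorted(y, reverse=True)
    sx = sy = 0.0
    for a, b in zip(xs, ys):
        sx += a
        sy += b
        if sx > sy + tol:
            return False
    return True


def majorized(x, y, tol=1e-12):
    """True if x is majorized by y: weak majorization plus equal total sums."""
    return weakly_majorized(x, y, tol) and abs(sum(x) - sum(y)) <= tol


print(majorized([1.0, 1.0, 1.0], [3.0, 0.0, 0.0]))
print(majorized([3.0, 0.0, 0.0], [1.0, 1.0, 1.0]))
print(weakly_majorized([1.0, 0.0], [2.0, 2.0]), majorized([1.0, 0.0], [2.0, 2.0]))
```

The cost is dominated by the two sorts, followed by a single pass over the partial sums.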

Theorem 7.2.6 For $y \in \mathbb{R}^n$ let

(7.2.6) $\mathcal{M}(y) := \{x \in \mathbb{R}^n,\ x \prec y\}.$

Then $\mathcal{M}(y)$ is a polyhedron whose extreme points are $Py$ for all $P \in \mathcal{P}_n$. In particular, $x \prec y$ if and only if there exists $A \in \Omega_n$ such that $x = Ay$.

Proof. Observe first that $x = (x_1, \ldots, x_n)^T \prec y = (y_1, \ldots, y_n)^T$ is equivalent to the following conditions

(7.2.7) $\sum_{i=1}^n x_i = \sum_{i=1}^n y_i,$

(7.2.8) $\sum_{i=1}^k (Px)_i \le \sum_{i=1}^k \bar{y}_i, \quad k = 1, \ldots, n - 1$, for each $P \in \mathcal{P}_n$.

Clearly, $\mathcal{M}(y)$ is a closed convex set. Also $x_i \le \bar{y}_1$ for $i = 1, \ldots, n$. Hence

$x_i = \sum_{j=1}^n y_j + \sum_{j=1, j \ne i}^n (-x_j) \ge -(n - 1)\bar{y}_1 + \sum_{j=1}^n y_j.$

Thus $\mathcal{M}(y)$ is a compact convex set containing $Qy$ for all $Q \in \mathcal{P}_n$. Hence $\mathcal{M}(y)$ is a polytope.
Clearly

(7.2.9) $P\mathcal{M}(Qy) = \mathcal{M}(y)$ for each $P, Q \in \mathcal{P}_n$.
Assume that $x$ is an extremal point of $\mathcal{M}(y)$. Then the above equality implies that $Px$ is also an extreme point of $\mathcal{M}(y)$ for each $P \in \mathcal{P}_n$. Let $Px = \bar{x}$, $Qy = \bar{y}$ for some $P, Q \in \mathcal{P}_n$. We claim that $\bar{x} = \bar{y}$. Without loss of generality we consider the case that $x = \bar{x}$, $y = \bar{y}$. We prove this claim by induction on $n$. For $n = 1$ this claim is trivial. Assume that this claim holds for any $m \le n - 1$.

Let $m = n$. Assume to the contrary that $x \ne y$. Suppose first that for some $1 \le k \le n - 1$ we have the equality $\sum_{i=1}^k x_i = \sum_{i=1}^k y_i$. Let

$\mathbf{x}_1 = (x_1, \ldots, x_k)^T,\ \mathbf{y}_1 = (y_1, \ldots, y_k)^T \in \mathbb{R}^k_{\searrow},$
$\mathbf{x}_2 = (x_{k+1}, \ldots, x_n)^T,\ \mathbf{y}_2 = (y_{k+1}, \ldots, y_n)^T \in \mathbb{R}^{n-k}_{\searrow}.$

Then $\mathbf{x}_1 \prec \mathbf{y}_1$, $\mathbf{x}_2 \prec \mathbf{y}_2$. Use the induction hypothesis to deduce that $\mathbf{x}_i = \mathbf{y}_i$, $i = 1, 2$. Hence $x = y$, contrary to our assumption.

It is left to consider the case where strict inequalities hold in (7.2.5) for $k = 1, \ldots, n - 1$. In particular $y_1 > y_n$ and $y_1 > x_1$, $x_n > y_n$. Assume first that $x_1 = \ldots = x_n$. Then $x = \frac{1}{n!}\sum_{P \in \mathcal{P}_n} Py$ and $x$ cannot be an extremal point in $\mathcal{M}(y)$, contrary to our assumption. Hence, there exists an integer $k \in [1, n - 1]$ such that $x_1 = \ldots = x_k > x_{k+1}$. For $t \in \mathbb{R}$ define $x(t) = (x_1(t), \ldots, x_n(t))^T$, where

$x_i(t) = x_i + t$ for $i = 1, \ldots, k$, $\quad x_i(t) = x_i - \frac{k}{n - k}t$ for $i = k + 1, \ldots, n$.

It is straightforward to see that there exists $\varepsilon > 0$ such that for each $t \in [-\varepsilon, \varepsilon]$, $x(t) \in \mathbb{R}^n_{\searrow} \cap \mathcal{M}(y)$. As $x = \frac{1}{2}x(\varepsilon) + \frac{1}{2}x(-\varepsilon)$ we deduce that $x$ is not an extremal point, contrary to our assumption. Hence $\mathcal{E}(\mathcal{M}(y)) = \cup_{P \in \mathcal{P}_n}\{Py\}$.

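Let $x \in \mathcal{M}(y)$. Then

(7.2.10) $x = \sum_{P \in \mathcal{P}_n} a_P (Py) = \Big(\sum_{P \in \mathcal{P}_n} a_P P\Big)y$, where $a_P \ge 0$, $P \in \mathcal{P}_n$, $\sum_{P \in \mathcal{P}_n} a_P = 1$.

Hence $x = Ay$ for a corresponding $A \in \Omega_n$. Vice versa, any $A \in \Omega_n$ is a convex combination of the permutation matrices. Therefore $Ay \in \mathcal{M}(y)$ for any doubly stochastic matrix $A$. □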



Problems 

1. Prove Lemma 7.2.3. 
2. Let $x, y \in \mathbb{R}^n$. Show that $x \prec y \iff -x \prec -y$.

3. Let $m, n \in \mathbb{N}$. Denote by $\Omega_{m,n} \subset \mathbb{R}_+^{m \times n}$ the set of stochastic matrices, i.e. each row is a probability vector, such that each column has sum $\frac{m}{n}$. Assume that $m \ne n$. Show

(a) $\Omega_{m,n}$ is a polytope.

(b) $\dim \Omega_{m,n} = (m - 1)(n - 1)$.

(c) Each extreme point of $\Omega_{m,n}$ has at most $m + n - 1$ nonzero elements.

4. Let $x = (x_1, \ldots, x_n)^T \in \mathbb{R}^n$. Recall that one needs $O(n \log n)$ swaps to obtain the coordinates of $\bar{x}$. Deduce that for given $x, y \in \mathbb{R}^n$ one needs $O(n \log n)$ swaps and $2n^2$ additions of entries of $x$ and $y$ to determine if $x$ is or is not in the set $\mathcal{M}(y)$.

7.3 Convex functions 

Definition 7.3.1 Let $C$ be a convex set in a finite dimensional vector space $V$ over $\mathbb{F} = \mathbb{R}, \mathbb{C}$. A function $\phi : C \to \mathbb{R}$ is called convex if for any $x, y \in C$ and $t \in [0, 1]$

(7.3.1) $\phi(tx + (1 - t)y) \le t\phi(x) + (1 - t)\phi(y).$

$\phi$ is called strictly convex on $C$ if for any $x, y \in C$, $x \ne y$ and $t \in (0, 1)$ strict inequality holds in (7.3.1). A function $\psi : C \to \mathbb{R}$ is called concave or strictly concave if the function $-\psi$ is convex or strictly convex respectively.

We remark that in [Roc70] a convex function $\phi$ on a convex set is allowed to have the values $\pm\infty$. To avoid complications, we restrict our attention to convex functions with finite values. The following result is well known [Roc70, Theorem 10.1].

Theorem 7.3.2 Let C be a convex set in a finite dimensional subspace 

V over F = R, C. Assume that <f> : C — > R is convex. The <p ■ ri C — > R is 
continuous. 

For any set T € V we let CI T be the closure of T in the standard topology 
in V (which is identified with the standard topology of M dlm kV ). 

Proposition 7.3.3 Let $C \subset V$ be convex. Then $\mathrm{Cl}\, C$ is convex. Assume
that $\mathrm{ri}\, C$ is an open set in $V$ and $f \in C^0(\mathrm{Cl}\, C)$. Then $f$ is convex in
$\mathrm{Cl}\, C$ if and only if $f$ is convex in $C$.



See Problem 3.

Theorem 7.3.4 Let $C$ be a compact convex set in a finite dimensional
subspace $V$ over $\mathbb{F} = \mathbb{R}, \mathbb{C}$. Assume that $\phi : C \to \mathbb{R}$ is a continuous convex function. Then

(7.3.2)   $\max_{x \in C} \phi(x) = \max_{y \in \mathcal{E}(C)} \phi(y)$.

Assume in addition that $\phi$ is strictly convex. If $\phi$ achieves its maximum at
$x^*$, then $x^*$ is an extreme point of $C$.

Proof. Assume that $\max_{x \in C} \phi(x) = \phi(x^*)$. Suppose first that $x^*$ is
an extreme point of $C$. Then (7.3.2) trivially holds. Assume now that $x^*$
is not an extreme point of $C$. Theorem 7.1.3 yields that $x^* = \sum_{i=1}^m a_i x_i$,
where $a_i \in (0, 1)$, $x_i \in \mathcal{E}(C)$, $i = 1, \ldots, m$ and $m \ge 2$. The convexity of
$\phi$ and (7.3.8) yield that

$\phi(x^*) \le \sum_{i=1}^m a_i \phi(x_i) \le \max_{i = 1, \ldots, m} \phi(x_i) = \phi(x_j)$ for some $j \in \langle m \rangle$.

Since $\phi$ achieves its maximum at $x^*$ we deduce that $\phi(x^*) = \phi(x_j)$. Hence
(7.3.2) holds.

Suppose now that $\phi$ is strictly convex. Then Problem 1b implies that
strict inequality holds in the above inequality. Hence $\phi(x^*) < \phi(x_j)$, which
contradicts the maximality of $x^*$. □
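A small numerical illustration of (7.3.2) may help. The sketch below is an assumption-laden example, not part of the source: the polytope is the square $[-1, 1]^2$, the convex function `phi` is a shifted squared distance chosen only for illustration, and the maximum over the whole set is approximated on a grid that contains the four vertices.

```python
from itertools import product


def phi(p):
    """A convex (in fact strictly convex) function on the plane; hypothetical choice."""
    x1, x2 = p
    return (x1 - 0.3) ** 2 + (x2 + 0.2) ** 2


# Extreme points of the square C = [-1, 1]^2.
vertices = list(product([-1.0, 1.0], repeat=2))

# A grid on C that contains the vertices, used as a stand-in for "all of C".
grid = [(i / 10, j / 10) for i in range(-10, 11) for j in range(-10, 11)]

max_on_vertices = max(phi(v) for v in vertices)
max_on_grid = max(phi(p) for p in grid)
best_vertex = max(vertices, key=phi)

print(max_on_grid, max_on_vertices, best_vertex)
```

Because `phi` is strictly convex, the grid maximum is attained only at a vertex, in line with the second part of Theorem 7.3.4.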



Theorem 7.3.5 Let $x = (x_1, \ldots, x_n)^T$, $y = (y_1, \ldots, y_n)^T \in \mathbb{R}^n$ and assume
that $x \prec y$. Let $\phi : [y_n, y_1] \to \mathbb{R}$ be a convex function. Then

(7.3.3)   $\sum_{i=1}^n \phi(x_i) \le \sum_{i=1}^n \phi(y_i)$.

If $\phi$ is strictly convex on $[y_n, y_1]$ and $Px \ne y$ for all $P \in \mathcal{P}_n$ then strict
inequality holds in the above inequality.

Proof. Define $\psi : \mathcal{M}(y) \to \mathbb{R}$ by the equality $\psi((x_1, \ldots, x_n)^T) = \sum_{i=1}^n \phi(x_i)$.
Since for any $(x_1, \ldots, x_n)^T \in \mathcal{M}(y)$ we have $x_i \in [y_n, y_1]$, it follows that $\psi$ is
well defined on $\mathcal{M}(y)$. Clearly for each $P \in \mathcal{P}_n$ we have the equality
$\psi(Py) = \psi(y)$. (7.2.10) and the convexity of $\phi$ yield

$\phi(x_i) = \phi(\sum_{P \in \mathcal{P}_n} a_P (Py)_i) \le \sum_{P \in \mathcal{P}_n} a_P \phi((Py)_i)$.

Sum on $i = 1, \ldots, n$ to deduce that

$\psi(x) \le \sum_{P \in \mathcal{P}_n} a_P \psi(Py) = \sum_{P \in \mathcal{P}_n} a_P \psi(y) = \psi(y)$.

If $\phi$ is strictly convex and $x \notin \mathcal{E}(\mathcal{M}(y))$ then the above inequalities are
strict. □
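The proof above can be traced numerically. In this sketch, which is only an illustration under stated assumptions, the vector `x` is built as a convex combination of all rearrangements of `y` with the hypothetical uniform weights $a_P = 1/n!$, and (7.3.3) is then checked for a few convex functions chosen for the example.

```python
import math
from itertools import permutations

y = [4.0, 1.0, 1.0]

# All rearrangements P y, and a hypothetical uniform choice of weights a_P.
rearrangements = list(permutations(y))
weights = [1.0 / len(rearrangements)] * len(rearrangements)

# x = sum_P a_P (P y), so x is majorized by y.
x = [sum(w * p[i] for w, p in zip(weights, rearrangements)) for i in range(len(y))]

# A few convex functions on the real line (illustrative choices).
convex_functions = {"square": lambda t: t * t, "exp": math.exp, "abs": abs}

# gap = sum phi(y_i) - sum phi(x_i), which (7.3.3) predicts to be nonnegative.
gaps = {name: sum(map(f, y)) - sum(map(f, x)) for name, f in convex_functions.items()}

print(x)
print(gaps)
```

For the strictly convex functions the gap is strictly positive, while for the absolute value, which is convex but not strictly convex, the gap vanishes up to rounding.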

The following result is well known. (See Problems 2-3.)

Theorem 7.3.6 Let $C \subset V$ be a convex set. Assume that $\mathrm{ri}\, C$ is an
open set in $V$ and $\phi \in C^2(C)$. Then $\phi$ is convex if and only if the symmetric
matrix $H(\phi) := \left[\frac{\partial^2 \phi}{\partial x_i \partial x_j}(y)\right]$ is nonnegative definite for each $y \in C$.
Furthermore, if $H(\phi)$ is positive definite for each $y \in C$ then $\phi$ is strictly
convex.

Let $\phi : C \to \mathbb{R}$ be a convex function. Then $\phi$ has the following continuity
and differentiability properties. In the one dimensional case, where
$C = (a, b) \subset \mathbb{R}$, $\phi$ is continuous on $C$ and $\phi$ has a derivative $\phi'(x)$ at all but a
countable set of points. $\phi'(x)$ is a nondecreasing function (where defined).
In particular $\phi$ has left and right derivatives at each $x$, which are given as
the left and the right limits of $\phi'(x)$ (where defined).

In the general case $C \subset V$, $\phi$ is a continuous function in $\mathrm{ri}\, C$, has a
differential $D\phi$ in a dense set $C_1$ of $\mathrm{ri}\, C$, the complement of $C_1$ in $\mathrm{ri}\, C$ has
zero measure, and $D\phi$ is continuous in $C_1$. Furthermore at each $x \in \mathrm{ri}\, C$,
$\phi$ has a subdifferential $f \in \mathrm{Hom}(V, \mathbb{R})$ such that

(7.3.4)   $\phi(y) \ge \phi(x) + f(y - x)$ for all $y \in C$.

See for example [Roc70].

Definition 7.3.7 Let $C \subset V$ be a convex set. Then $f : C \to \mathbb{R}$ is called
an affine function if for each $x, y \in C$, $f(tx + (1 - t)y) = tf(x) + (1 - t)f(y)$
for each $t \in [0, 1]$.

Clearly, for an affine function $f$ on a convex set the functions $f$ and $-f$ are
convex. Theorem 7.3.4 yields the following.

Corollary 7.3.8 Let $C \subset V$ be a compact convex set. Assume that
$f : C \to \mathbb{R}$ is an affine function. Then

$\max_{x \in C} f(x) = \max_{y \in \mathcal{E}(C)} f(y)$,   $\min_{x \in C} f(x) = \min_{y \in \mathcal{E}(C)} f(y)$.

Let $C \subset V$ be a polytope. Then finding the maximum or the minimum
of an affine function on $C$ is called linear programming. It is known that
the complexity of linear programming is polynomial in the dimension
of $V$ and in the complexity of all the half spaces characterizing $C$ and
$f$ [Kha79, Kar84].

We now give a simple example. Let $A = [a_{ij}] \in \mathbb{R}^{n \times n}$. Denote by $S_n$
the group of permutations $\pi : \langle n \rangle \to \langle n \rangle$. Consider the problem
of maximizing the sum of a generalized diagonal of $A$:

(7.3.5)   $\mu(A) := \max_{\pi \in S_n} \sum_{i=1}^n a_{i\pi(i)}$.

Since $\#S_n = n!$, a brute force algorithm that tries all the permutations will need
$n! \approx (\frac{n}{e})^n$ computations. (We ignore the complexity of computing
the sum $\sum_{i=1}^n a_{i\pi(i)}$.) However, $\mu(A)$ can be computed polynomially in $n$.
Define an affine $f : \Omega_n \to \mathbb{R}$ by $f(X) = \mathrm{tr}\, AX$, for any doubly stochastic
$X$. It is straightforward to show that

(7.3.6)   $\mu(A) = \max_{X \in \Omega_n} \mathrm{tr}\, AX$.

Since $\Omega_n$ is a polytope given by at most $4n + n^2$ inequalities, and the
complexity of $f$ is $n^2$ times the complexity of entries, we see that the complexity
of computing the maximum of $f(X)$ is polynomial in $n$.
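For small $n$ the brute force computation of $\mu(A)$ in (7.3.5) is easy to write down, and it can be compared with values of $\mathrm{tr}\, AX$ at doubly stochastic $X$. The sketch below only illustrates (7.3.5) and (7.3.6); it does not implement the polynomial time linear programming approach, and the helper names `mu_brute_force`, `trace_ax` and `permutation_matrix` are hypothetical.

```python
from itertools import permutations


def mu_brute_force(a):
    """Maximum sum of a generalized diagonal of the square matrix a (n! candidates)."""
    n = len(a)
    return max(sum(a[i][pi[i]] for i in range(n)) for pi in permutations(range(n)))


def trace_ax(a, x):
    """tr(A X) = sum_{i,j} a_ij x_ji."""
    n = len(a)
    return sum(a[i][j] * x[j][i] for i in range(n) for j in range(n))


def permutation_matrix(pi):
    """Matrix X with x[pi[i]][i] = 1, so that tr(A X) = sum_i a[i][pi[i]]."""
    n = len(pi)
    x = [[0] * n for _ in range(n)]
    for i in range(n):
        x[pi[i]][i] = 1
    return x


a = [[4, 1, 3],
     [2, 0, 5],
     [3, 2, 2]]

mu = mu_brute_force(a)
uniform = [[1 / 3] * 3 for _ in range(3)]  # a doubly stochastic matrix
print(mu, trace_ax(a, uniform))
```

The value at the uniform doubly stochastic matrix stays below `mu`, and the maximum is attained at a permutation matrix, an extreme point of $\Omega_n$, as Corollary 7.3.8 predicts.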

Definition 7.3.9 Let $V_1, V_2$ be finite dimensional vector spaces. Assume
that $C_1 \subset V_1$, $C_2 \subset V_2$ are convex sets. A function $\phi : C_1 \times C_2 \to \mathbb{R}$
is called concave-convex if the functions $\phi(\cdot, y) : C_1 \to \mathbb{R}$, $\phi(x, \cdot) : C_2 \to \mathbb{R}$
are concave for each $y \in C_2$ and convex for each $x \in C_1$ respectively.

The following result is known as the minimax theorem [Roc70, Cor. 37.6.2].

Theorem 7.3.10 Let $C_i$ be a compact convex set in a finite dimensional
vector space $V_i$ for $i = 1, 2$. Assume that $\phi : C_1 \times C_2 \to \mathbb{R}$ is a
continuous concave-convex function. Then

(7.3.7)   $\min_{y \in C_2} \max_{x \in C_1} \phi(x, y) = \max_{x \in C_1} \min_{y \in C_2} \phi(x, y) = \phi(x^*, y^*)$

for some $x^* \in C_1$, $y^* \in C_2$.

The point $(x^*, y^*)$ is called a saddle point. More general types of
minimax theorems can be found in [Roc70].
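The equality (7.3.7) can be observed on a simple example. The following sketch assumes the hypothetical function $\phi(x, y) = y^2 - x^2 + xy$ on $C_1 = C_2 = [-1, 1]$, which is concave in $x$ and convex in $y$, and it replaces both intervals by a finite grid containing the saddle point $(0, 0)$.

```python
def phi(x, y):
    """Concave in x, convex in y; an illustrative choice, not from the text."""
    return y * y - x * x + x * y


# Finite stand-in for the interval [-1, 1]; it contains the point 0.
grid = [i / 10 for i in range(-10, 11)]

# Left-hand side of (7.3.7): min over y of max over x.
min_max = min(max(phi(x, y) for x in grid) for y in grid)

# Middle term of (7.3.7): max over x of min over y.
max_min = max(min(phi(x, y) for y in grid) for x in grid)

print(min_max, max_min, phi(0.0, 0.0))
```

Both quantities equal the value of `phi` at the saddle point. For a function that is not concave-convex only the inequality of Problem 4 is guaranteed.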

Problems 

1. Let $C \subset V$ be a convex set and assume that $\phi : C \to \mathbb{R}$ is convex.
Let $x_1, \ldots, x_m \in C$, $m \ge 3$. Show

(a) Let $a_1, \ldots, a_m \in [0, 1]$ and assume that $\sum_{i=1}^m a_i = 1$. Then

(7.3.8)   $\phi(\sum_{i=1}^m a_i x_i) \le \sum_{i=1}^m a_i \phi(x_i)$.

(b) Assume in addition that $\phi$ is strictly convex, $x_i \ne x_j$ for $i \ne j$
and $a_1, \ldots, a_m > 0$. Then strict inequality holds in (7.3.8).

2. (a) Let $f \in C^1(a, b)$. Show that $f$ is convex on $(a, b)$ if and only if
$f'(x)$ is nondecreasing on $(a, b)$. Show that if $f'(x)$ is increasing on
$(a, b)$ then $f$ is strictly convex on $(a, b)$.

(b) Let $f \in C[a, b] \cap C^1(a, b)$. Show that $f$ is convex in $[a, b]$ if and
only if $f$ is convex in $(a, b)$. Show that if $f'(x)$ is increasing on $(a, b)$
then $f$ is strictly convex on $[a, b]$.

(c) Let $f \in C^2(a, b)$. Show that $f$ is convex on $(a, b)$ if and only if $f''$
is a nonnegative function on $(a, b)$. Show that if $f''(x) > 0$ for each
$x \in (a, b)$ then $f$ is strictly convex on $(a, b)$.

(d) Prove Theorem 7.3.6. 

3. Prove Proposition 7.3.3. 

4. Let $C_i \subset V_i$ be a compact set in a finite dimensional vector space for
$i = 1, 2$. Let $\phi : C_1 \times C_2 \to \mathbb{R}$ be a continuous function. Show the
inequality

(7.3.9)   $\min_{y \in C_2} \max_{x \in C_1} \phi(x, y) \ge \max_{x \in C_1} \min_{y \in C_2} \phi(x, y)$.

7.4 Norms over vector spaces 

In this Chapter we assume that $\mathbb{F} = \mathbb{R}, \mathbb{C}$ unless stated otherwise.

Definition 7.4.1 Let $V$ be a vector space over $\mathbb{F}$. A continuous function
$\|\cdot\| : V \to [0, \infty)$ is called a norm if the following conditions are
satisfied:

1. Positivity: $\|v\| = 0$ if and only if $v = 0$.

2. Homogeneity: $\|av\| = |a| \|v\|$ for each $a \in \mathbb{F}$ and $v \in V$.

3. Subadditivity: $\|u + v\| \le \|u\| + \|v\|$ for all $u, v \in V$.

A continuous function $\|\cdot\| : V \to [0, \infty)$ which satisfies the conditions 2
and 3 is called a seminorm. The sets

$\mathrm{B}_{\|\cdot\|} := \{v \in V, \|v\| \le 1\}$,   $\mathrm{B}^o_{\|\cdot\|} := \{v \in V, \|v\| < 1\}$,

$\mathrm{S}_{\|\cdot\|} := \{v \in V, \|v\| = 1\}$,

are called the (closed) unit ball, the open unit ball and the unit sphere of
the norm respectively. For $a \in V$ and $r > 0$ we let

$\mathrm{B}_{\|\cdot\|}(a, r) = \{x \in V : \|x - a\| \le r\}$,   $\mathrm{B}^o_{\|\cdot\|}(a, r) = \{x \in V : \|x - a\| < r\}$

be the closed and the open ball of radius $r$ centered at $a$ respectively. If the
norm $\|\cdot\|$ is fixed, we use the notation

$\mathrm{B}(a, r) = \mathrm{B}_{\|\cdot\|}(a, r)$,   $\mathrm{B}^o(a, r) = \mathrm{B}^o_{\|\cdot\|}(a, r)$.

See Problem 2 for the properties of unit balls. The standard norms on
$\mathbb{F}^n$ are the $\ell_p$ norms:

(7.4.1)   $\|(x_1, \ldots, x_n)^T\|_p = (\sum_{i=1}^n |x_i|^p)^{1/p}$, $p \in [1, \infty)$,
          $\|(x_1, \ldots, x_n)^T\|_\infty = \max_{1 \le i \le n} |x_i|$.

See Problem 8.
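The $\ell_p$ norms of (7.4.1) are straightforward to evaluate, and some of the claims of Problem 8 can be checked numerically on examples. The sketch below is a plain illustration; the function names `lp_norm` and `conjugate_exponent` and the sample vectors are hypothetical.

```python
import math


def lp_norm(x, p):
    """The l_p norm of the vector x for p in [1, inf], following (7.4.1)."""
    if p == math.inf:
        return max(abs(t) for t in x)
    if p < 1:
        raise ValueError("p must lie in [1, inf]")
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def conjugate_exponent(p):
    """The exponent q with 1/p + 1/q = 1, using the convention 1/inf = 0."""
    if p == 1:
        return math.inf
    if p == math.inf:
        return 1.0
    return p / (p - 1.0)


x = [3.0, -4.0, 1.0]
y = [1.0, 2.0, -2.0]
exponents = [1, 1.5, 2, 3, 10, math.inf]

# The norms of a fixed vector, listed for increasing p.
norms_of_x = [lp_norm(x, p) for p in exponents]

# Hoelder's inequality: |y^T x| <= ||x||_p ||y||_q for conjugate p, q.
inner = abs(sum(s * t for s, t in zip(x, y)))
holder_bounds = [lp_norm(x, p) * lp_norm(y, conjugate_exponent(p)) for p in exponents]

print(norms_of_x)
print(inner, holder_bounds)
```

On this example the norms decrease as $p$ grows and every Hölder bound dominates the inner product, consistent with parts (a) and (f) of Problem 8.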

Definition 7.4.2 Let $V$ be a finite dimensional vector space over $\mathbb{F}$.
Denote by $V^*$ the set of all linear functionals $f : V \to \mathbb{F}$. Assume that $\|\cdot\|$
is a norm on $V$. The conjugate norm $\|\cdot\|^* : V^* \to [0, \infty)$ is defined as

$\|f\|^* = \max_{x \in \mathrm{B}_{\|\cdot\|}} |f(x)|$, for $f \in V^*$.

For a norm $\|\cdot\|$ on $\mathbb{F}^n$ the conjugate norm $\|\cdot\|^*$ on $\mathbb{F}^n$ is given by

(7.4.2)   $\|x\|^* = \max_{y \in \mathrm{B}_{\|\cdot\|}} |y^T x|$ for $x \in \mathbb{F}^n$.

A norm $\|\cdot\|$ on $V$ is called strictly convex if for any two distinct points
$x, y \in \mathrm{S}_{\|\cdot\|}$ and $t \in (0, 1)$ the inequality $\|tx + (1 - t)y\| < 1$ holds. A norm
$\|\cdot\|$ on $\mathbb{F}^n$ is called $C^k$, for $k \in \mathbb{N}$, if the sphere $\mathrm{S}_{\|\cdot\|}$ is a $C^k$ manifold. $\|\cdot\|$
is called smooth if it is $C^k$ for each $k \in \mathbb{N}$.

For $x = (x_1, \ldots, x_n)^T \in \mathbb{C}^n$ let $\mathrm{abs}\, x = (|x_1|, \ldots, |x_n|)^T$. A norm $\|\cdot\|$
on $\mathbb{F}^n$ is called absolute if $\|x\| = \|\mathrm{abs}\, x\|$ for each $x \in \mathbb{F}^n$. A norm $|||\cdot|||$
on $\mathbb{F}^n$ is called a transform absolute if there exists an absolute norm $\|\cdot\|$
on $\mathbb{F}^n$ and $P \in \mathrm{GL}(n, \mathbb{F})$ such that $|||x||| = \|Px\|$ for each $x \in \mathbb{F}^n$.

A norm $\|\cdot\|$ on $\mathbb{F}^n$ is called symmetric if the function $\|(x_1, \ldots, x_n)^T\|$
is a symmetric function in $x_1, \ldots, x_n$. (I.e. for each permutation $\pi : \langle n \rangle \to \langle n \rangle$
and each $x = (x_1, \ldots, x_n)^T \in \mathbb{F}^n$ the equality $\|(x_{\pi(1)}, \ldots, x_{\pi(n)})^T\| = \|(x_1, \ldots, x_n)^T\|$ holds.)

Theorem 7.4.3 Let $\|\cdot\|$ be a norm on $\mathbb{F}^n$. Then the following are
equivalent.

1. $\|\cdot\|$ is an absolute norm.

2. $\|\cdot\|^*$ is an absolute norm.

3. There exists a compact set $L \subset \mathbb{F}^n$ not contained in the hyperplane
$H_i = \{(y_1, \ldots, y_n)^T \in \mathbb{F}^n, y_i = 0\}$ for $i = 1, \ldots, n$, such that
$\|x\| = \max_{y \in L} (\mathrm{abs}\, y)^T \mathrm{abs}\, x$ for each $x \in \mathbb{F}^n$.

4. $\|x\| \le \|z\|$ if $\mathrm{abs}\, x \le \mathrm{abs}\, z$.

Proof. 1⇒2. Assume that $x, y \in \mathbb{F}^n$. Then there exists $z \in \mathbb{F}^n$, $\mathrm{abs}\, z = \mathrm{abs}\, y$,
such that $|z^T x| = (\mathrm{abs}\, y)^T \mathrm{abs}\, x$. Since $\|\cdot\|$ is absolute, $\|z\| = \|y\|$.
Clearly $|y^T x| \le (\mathrm{abs}\, y)^T \mathrm{abs}\, x$. The characterization (7.4.2) yields that

(7.4.3)   $\|x\|^* = \max_{y \in \mathrm{B}_{\|\cdot\|}} (\mathrm{abs}\, y)^T \mathrm{abs}\, x$.

Clearly $\|x\|^* = \|\mathrm{abs}\, x\|^*$.

2⇒3. The equality $(\|\cdot\|^*)^* = \|\cdot\|$, see Problem 3, and the equality
(7.4.3) imply 3 with $L = \mathrm{B}_{\|\cdot\|^*}$. Clearly $\mathrm{B}_{\|\cdot\|^*}$ contains a vector all of whose
coordinates are different from zero.

3⇒4. Assume that $\mathrm{abs}\, x \le \mathrm{abs}\, z$. Then $(\mathrm{abs}\, y)^T \mathrm{abs}\, x \le (\mathrm{abs}\, y)^T \mathrm{abs}\, z$
for any $y \in \mathbb{F}^n$. In view of the characterization of the absolute norm given
in 3 we deduce 4.

4⇒1. Assume that $\mathrm{abs}\, x = \mathrm{abs}\, y$. Since $\mathrm{abs}\, x \le \mathrm{abs}\, y$ we deduce that
$\|x\| \le \|y\|$. Similarly $\|x\| \ge \|y\|$. Hence 1 holds. □



Definition 7.4.4 A set $L \subset \mathbb{F}^n$ is called symmetric if for each
$y = (y_1, \ldots, y_n)^T$ in $L$ the vector $(y_{\pi(1)}, \ldots, y_{\pi(n)})^T$ is in $L$, for each
permutation $\pi : \langle n \rangle \to \langle n \rangle$.

Corollary 7.4.5 Let $\|\cdot\|$ be a norm on $\mathbb{F}^n$. Then the following are
equivalent.

1. $\|\cdot\|$ is an absolute symmetric norm.

2. $\|\cdot\|^*$ is an absolute symmetric norm.

3. There exists a compact symmetric set $L \subset \mathbb{F}^n$, not contained in the
hyperplane $H_i = \{(y_1, \ldots, y_n)^T \in \mathbb{F}^n, y_i = 0\}$ for $i = 1, \ldots, n$, such
that $\|x\| = \max_{y \in L} (\mathrm{abs}\, y)^T \mathrm{abs}\, x$ for each $x \in \mathbb{F}^n$.

See Problem 6.

Proposition 7.4.6 Assume that $\|\cdot\|$ is a symmetric absolute norm on
$\mathbb{R}^2$. Then

(7.4.4)   $\|x\|_2^2 \le \|x\|^* \|x\| \le \|x\|_1 \|x\|_\infty$, for any $x \in \mathbb{R}^2$.

Proof. The lower bound follows from Problem 9. We claim that
for any two points $x, y$ satisfying the condition $\|x\| = \|y\| = 1$ the
following inequality holds

(7.4.5)   $(\|x\|_1 - \|y\|_1)(\|x\|_\infty - \|y\|_\infty) \le 0$.

Since $\|\cdot\|$ is symmetric and absolute it is enough to prove the above inequality
in the case that $x = (x_1, x_2)^T$, $x_1 \ge x_2 \ge 0$, $y = (y_1, y_2)^T$, $y_1 \ge y_2 \ge 0$.
View $\mathrm{B}_{\|\cdot\|}$ as a convex balanced set in $\mathbb{R}^2$, which is symmetric with respect
to the line $z_1 = z_2$. The symmetricity of $\|\cdot\|$ implies that all
the points $(z_1, z_2)^T \in \mathrm{B}_{\|\cdot\|}$ satisfy the inequality $z_1 + z_2 \le 2c$, where
$\|(c, c)^T\| = 1$, $c > 0$. Let $C, D$ be the intersection of $\mathrm{B}_{\|\cdot\|}$, $\mathrm{S}_{\|\cdot\|}$ with the
octant $K = \{(z_1, z_2)^T \in \mathbb{R}^2, z_1 \ge z_2 \ge 0\}$ respectively. Observe that
the line $z_1 + z_2 = 2c$ may intersect $D$ at an interval. However the line
$z_1 + z_2 = 2t$ will intersect $D$ at a unique point $(z_1(t), z_2(t))^T$ for $t \in [b, c)$,
where $\|(2b, 0)^T\| = 1$, $b > 0$. Furthermore $z_1(t)$, $-z_2(t)$ are decreasing in
$(b, c)$. Hence, if $x_1 + x_2 > y_1 + y_2$ it follows that $y_1 > x_1$. Similarly,
if $x_1 + x_2 < y_1 + y_2$ it follows that $y_1 < x_1$. This proves (7.4.5).

To show the right-hand side of (7.4.4) we may assume that $\|x\| = 1$. So
$\|x\|^* = |y^T x|$ for some $y \in \mathrm{S}_{\|\cdot\|}$. Hence $\|x\| \|x\|^* = |y^T x|$. Clearly

$|y^T x| \le \min(\|x\|_1 \|y\|_\infty, \|x\|_\infty \|y\|_1)$.

Suppose that $\|y\|_1 \le \|x\|_1$. Then the right-hand side of (7.4.4) follows.
Assume that $\|y\|_1 > \|x\|_1$. Then (7.4.5) yields that $\|y\|_\infty \le \|x\|_\infty$
and the right-hand side of (7.4.4) follows. □

A norm $\|\cdot\| : \mathbb{F}^{m \times n} \to \mathbb{R}_+$ is called a matrix norm. A standard example
of a matrix norm is the Frobenius norm of $A = [a_{ij}] \in \mathbb{F}^{m \times n}$:

(7.4.6)   $\|A\|_F := (\sum_{i,j=1}^{m,n} |a_{ij}|^2)^{1/2}$.

Recall that $\|A\|_F = (\sum_i \sigma_i(A)^2)^{1/2}$, where $\sigma_i(A)$, $i = 1, \ldots$, are the singular
values of $A$. More generally, for each $q \in [1, \infty]$

(7.4.7)   $\|A\|_{q,S} := (\sum_i \sigma_i(A)^q)^{1/q}$

is a norm on $\mathbb{F}^{m \times n}$, which is called the $q$-Schatten norm of $A$. Furthermore,
for any integer $p \in [1, m]$ and $w_1 \ge \ldots \ge w_p > 0$, the function $f(A)$ given
in Corollary 4.10.5 is a norm on $\mathbb{F}^{m \times n}$. See Problem 4.10.4. We denote
$\|\cdot\|_{\infty,S} = \sigma_1(\cdot)$ as the $\|\cdot\|_2$ operator norm:

(7.4.8)   $\|A\|_{2,2} := \sigma_1(A)$ for $A \in \mathbb{C}^{m \times n}$.

See §7.7.
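For a real $2 \times 2$ matrix the singular values have a closed form, which makes (7.4.6) to (7.4.8) easy to check by hand. The sketch below is restricted to that special case and rests on the standard identities $\sigma_1^2 + \sigma_2^2 = \mathrm{tr}\, A^T A$ and $\sigma_1 \sigma_2 = |\det A|$; the function names are hypothetical.

```python
import math


def singular_values_2x2(a):
    """Singular values (largest first) of a real 2 x 2 matrix given as nested lists."""
    (a11, a12), (a21, a22) = a
    t = a11 ** 2 + a12 ** 2 + a21 ** 2 + a22 ** 2  # trace of A^T A, the squared Frobenius norm
    d = (a11 * a22 - a12 * a21) ** 2               # determinant of A^T A
    disc = math.sqrt(max(t * t - 4.0 * d, 0.0))    # clamp guards against rounding below zero
    s1 = math.sqrt((t + disc) / 2.0)
    s2 = math.sqrt(max((t - disc) / 2.0, 0.0))
    return s1, s2


def schatten_norm_2x2(a, q):
    """The q-Schatten norm (7.4.7) of a real 2 x 2 matrix, q in [1, inf]."""
    s1, s2 = singular_values_2x2(a)
    if q == math.inf:
        return s1
    return (s1 ** q + s2 ** q) ** (1.0 / q)


a = [[3.0, 0.0], [0.0, 4.0]]
b = [[1.0, 1.0], [0.0, 1.0]]
print(singular_values_2x2(a))
print(schatten_norm_2x2(a, 1), schatten_norm_2x2(a, 2), schatten_norm_2x2(a, math.inf))
print(schatten_norm_2x2(b, 2))
```

Here $q = 2$ reproduces the Frobenius norm and $q = \infty$ gives $\sigma_1(A)$, the operator norm of (7.4.8).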

Definition 7.4.7 A norm $\|\cdot\|$ on $\mathbb{C}^{m \times n}$ is called a unitary invariant
if $\|UAV\| = \|A\|$ for any $A \in \mathbb{C}^{m \times n}$ and unitary $U \in \mathrm{U}(m)$, $V \in \mathrm{U}(n)$.

Clearly, any $p$-Schatten norm on $\mathbb{C}^{m \times n}$ is unitary invariant.

Theorem 7.4.8 For positive integers $m, n$ let $l = \min(m, n)$. For $A \in \mathbb{C}^{m \times n}$
let $\sigma(A) := (\sigma_1(A), \ldots, \sigma_l(A))^T$. Then $\|\cdot\|$ is a unitary invariant
norm on $\mathbb{C}^{m \times n}$ if and only if there exists an absolute symmetric norm $|||\cdot|||$
on $\mathbb{C}^l$ such that $\|A\| = |||\sigma(A)|||$ for any $A \in \mathbb{C}^{m \times n}$.

Proof. Let $D(m, n) \subset \mathbb{C}^{m \times n}$ be the subspace of diagonal matrices.
Clearly, $D(m, n)$ is isomorphic to $\mathbb{C}^l$. Each $D \in D(m, n)$ is of the form
$\mathrm{diag}(x)$, $x = (x_1, \ldots, x_l)^T$, where $x_1, \ldots, x_l$ are the diagonal entries of $D$.
Assume that $\|\cdot\|$ is a norm on $\mathbb{C}^{m \times n}$. Then the restriction of $\|\cdot\|$ to $D(m, n)$
induces a norm $|||\cdot|||$ on $\mathbb{C}^l$ given by $|||x||| := \|\mathrm{diag}(x)\|$. Assume now that
$\|\cdot\|$ is a unitary invariant norm. For a given $x \in \mathbb{C}^l$, there exists a diagonal
unitary matrix $U$ such that $U \mathrm{diag}(x) = \mathrm{diag}(\mathrm{abs}\, x)$. Hence

$|||x||| = \|\mathrm{diag}(x)\| = \|U \mathrm{diag}(x)\| = \|\mathrm{diag}(\mathrm{abs}\, x)\| = |||\mathrm{abs}\, x|||$.

Let $\pi : \langle l \rangle \to \langle l \rangle$ be a permutation. Denote $x_\pi := (x_{\pi(1)}, \ldots, x_{\pi(l)})^T$.
Clearly there exist two permutation matrices $P \in \mathrm{U}(m)$, $Q \in \mathrm{U}(n)$ such
that $\mathrm{diag}(x_\pi) = P \mathrm{diag}(x) Q$. Hence $|||x_\pi||| = |||x|||$, and $|||\cdot|||$ is absolute
symmetric. Clearly, there exist unitary $U, V$ such that $A = U \mathrm{diag}(\sigma(A)) V$.
Hence $\|A\| = |||\sigma(A)|||$.

Assume now that $|||\cdot|||$ is an absolute symmetric norm on $\mathbb{C}^l$. Set
$\|A\| = |||\sigma(A)|||$ for any $A$. Clearly $\|\cdot\| : \mathbb{C}^{m \times n} \to \mathbb{R}_+$ is a continuous
function, which satisfies the properties 1-2 of Definition 7.4.1. It is left to
show that $\|\cdot\|$ satisfies the triangle inequality. Since $|||\cdot|||$ is absolute
symmetric and $\sigma_1(A) \ge \ldots \ge \sigma_l(A) \ge 0$, Corollary 7.4.5 yields that

(7.4.9)   $\|A\| = \max_{y = (y_1, \ldots, y_l)^T \in L,\ |y_1| \ge \ldots \ge |y_l|} (\mathrm{abs}\, y)^T \sigma(A)$

for a corresponding compact symmetric set $L \subset \mathbb{C}^l$, not contained in
the hyperplane $H_i = \{(y_1, \ldots, y_l)^T \in \mathbb{C}^l, y_i = 0\}$ for $i = 1, \ldots, l$. Use
Problem 4.10.6b to deduce from (7.4.9) that $\|A + B\| \le \|A\| + \|B\|$. □



Definition 7.4.9 A norm $\|\cdot\|$ on $\mathbb{F}^{n \times n}$ is called a spectral dominant
norm if $\|A\|$ is not less than $\rho(A)$, the spectral radius of $A$, for every $A \in \mathbb{F}^{n \times n}$.

Since $\sigma_1(A) \ge \rho(A)$, see (4.10.14) for $k = 1$, we deduce that any
$q$-Schatten norm is spectral dominant.



Problems 

1. Let $V$ be a finite dimensional vector space over $\mathbb{F}$. Show that a
seminorm $\|\cdot\| : V \to \mathbb{R}_+$ is a convex function.

2. Let $V$ be a finite dimensional vector space over $\mathbb{F}$. $X \subset V$ is called
balanced if $tX = X$ for every $t \in \mathbb{F}$ such that $|t| = 1$. Identify $V$ with
$\mathbb{F}^{\dim V}$. Then the topology on $V$ is the topology induced by open sets
in $\mathbb{F}^n$. Assume that $\|\cdot\|$ is a norm on $V$. Show

(a) $\mathrm{B}_{\|\cdot\|}$ is convex and compact.

(b) $\mathrm{B}_{\|\cdot\|}$ is balanced.

(c) $0$ is an interior point of $\mathrm{B}_{\|\cdot\|}$.

3. Let $V$ be a finite dimensional vector space over $\mathbb{F}$. Let $X \subset V$ be a
compact convex balanced set such that $0$ is its interior point. For each
$x \in V \setminus \{0\}$ let $f(x) = \min\{r > 0 : \frac{1}{r} x \in X\}$. Set $f(0) = 0$. Show
that $f$ is a norm on $V$ whose unit ball is $X$.

4. Let $V$ be a finite dimensional vector space over $\mathbb{F}$ with a norm $\|\cdot\|$.
Show

(a) $\|f\|^* = \max_{x \in \mathrm{S}_{\|\cdot\|}} |f(x)|$ for any $f \in V^*$.

(b) For any $x \in V$ and $f \in V^*$ the inequality $|f(x)| \le \|f\|^* \|x\|$ holds.

(c) Identify $(V^*)^*$ with $V$, i.e. any linear functional $\Lambda : V^* \to \mathbb{F}$
is of the form $\Lambda(f) = f(x)$ for some $x \in V$. Then $(\|x\|^*)^* = \|x\|$.

(d) Let $L \subset V^*$ be a compact set which contains a basis of $V^*$.
Define $\|x\|_L = \max_{f \in L} |f(x)|$. Then $\|x\|_L$ is a norm on $V$.

(e) Show that $\|\cdot\| = \|\cdot\|_L$ for a corresponding compact set $L \subset V^*$.
Give a simple choice of $L$.

5. Let $V$ be a finite dimensional vector space over $\mathbb{F}$, $\dim V \ge 1$, with
a norm $\|\cdot\|$. Show

(a) $\mathcal{E}(\mathrm{B}_{\|\cdot\|}) \subset \mathrm{S}_{\|\cdot\|}$.

(b) $\mathcal{E}(\mathrm{B}_{\|\cdot\|}) = \mathrm{S}_{\|\cdot\|}$ if and only if for any $x \ne y \in \mathrm{S}_{\|\cdot\|}$, $(x, y) \subset \mathrm{B}^o_{\|\cdot\|}$.

(c) For each $x \in \mathrm{S}_{\|\cdot\|}$ there exists $f \in \mathrm{S}_{\|\cdot\|^*}$ such that $1 = f(x) \ge |f(y)|$
for any $y \in \mathrm{B}_{\|\cdot\|}$.

(d) Each $f \in \mathrm{S}_{\|\cdot\|^*}$ is a proper supporting hyperplane of $\mathrm{B}_{\|\cdot\|}$ at some
point $x \in \mathrm{S}_{\|\cdot\|}$.

6. Prove Corollary 7.4.5. 

7. Let $V$ be a finite dimensional vector space over $\mathbb{F}$. Assume that
$\|\cdot\|_1$, $\|\cdot\|_2$ are two norms on $V$. Show

(a) $\|x\|_1 \le \|x\|_2$ for all $x \in V$ if and only if $\mathrm{B}_{\|\cdot\|_1} \supset \mathrm{B}_{\|\cdot\|_2}$.

(b) There exist $C \ge c > 0$ such that $c\|x\|_1 \le \|x\|_2 \le C\|x\|_1$
for all $x \in V$.

8. For any $p \in [1, \infty]$ define the conjugate $p^* = q \in [1, \infty]$ to satisfy the
equality $\frac{1}{p} + \frac{1}{q} = 1$. Show

(a) Hölder's inequality: $|y^* x| \le (\mathrm{abs}\, y)^T \mathrm{abs}\, x \le \|x\|_p \|y\|_{p^*}$ for any
$x, y \in \mathbb{C}^n \setminus \{0\}$ and $p \in [1, \infty]$. (For $p = 2$ this inequality is
called the Cauchy-Schwarz inequality.) Furthermore, equalities
hold in all inequalities if and only if $y = ax$ for some $a \in \mathbb{C} \setminus \{0\}$.
(Prove Hölder's inequality for $x, y \in \mathbb{R}^n$.)

(b) $\|x\|_p$ is a norm on $\mathbb{C}^n$ for $p \in [1, \infty]$.

(c) $\|x\|_p$ is strictly convex if and only if $p \in (1, \infty)$.

(d) For $p \in (1, \infty)$, $\mathcal{E}(\mathrm{B}_{\|\cdot\|_p}) = \mathrm{S}_{\|\cdot\|_p}$.

(e) Characterize $\mathcal{E}(\mathrm{B}_{\|\cdot\|_p})$ for $p = 1, \infty$ for $\mathbb{F} = \mathbb{R}, \mathbb{C}$.

(f) For each $x \in \mathbb{C}^n$ the function $\|x\|_p$ is a nonincreasing function
for $p \in [1, \infty]$.

9. Show that for any norm $\|\cdot\|$ on $\mathbb{F}^n$ the inequality

$\|x\|_2^2 \le \min(\|x\|^* \|\bar{x}\|, \|\bar{x}\|^* \|x\|)$ holds for any $x \in \mathbb{F}^n$.

In particular, if $\|\cdot\|$ is absolute then $\|x\|_2^2 \le \|x\|^* \|x\|$. (Hint: use
the equality $\|x\|_2^2 = \bar{x}^T x$.)

10. Let $L \subset \mathbb{F}^n$ satisfy the assumptions of condition 3 of Theorem 7.4.3.
Let $\nu(x) = \max_{y \in L} (\mathrm{abs}\, y)^T \mathrm{abs}\, x$ for each $x \in \mathbb{F}^n$. Show that $\nu(x)$
is an absolute norm on $\mathbb{F}^n$.

11. Let $\|\cdot\|$ be an absolute norm on $\mathbb{R}^n$. Show that it extends in a unique
way to an absolute norm on $\mathbb{C}^n$.

12. Let $V$ be a finite dimensional vector space over $\mathbb{F} = \mathbb{R}, \mathbb{C}$.

(a) Assume that $\|\cdot\|$ is a seminorm on $V$. Let $W := \{x \in V, \|x\| = 0\}$.
Show that $W$ is a subspace of $V$, and for each $x \in V$ the
function $\|\cdot\|$ is a constant function on $x + W$.

(b) Let $W$ be defined as above. Let $\hat{V}$ be the quotient space $V/W$.
So $\hat{v} \in \hat{V}$ is viewed as any $y \in v + W$ for a corresponding $v \in V$.
Define the function $|||\cdot||| : \hat{V} \to \mathbb{R}_+$ by $|||\hat{v}||| = \|y\|$. Show that
$|||\cdot|||$ is a norm on $\hat{V}$.

(c) Let $U_1, U_2$ be finite dimensional vector spaces over $\mathbb{F}$. Assume
that $|||\cdot||| : U_1 \to \mathbb{R}_+$ is a norm. Let $V = U_1 \oplus U_2$. Define
$\|u_1 \oplus u_2\| = |||u_1|||$ for each $u_i \in U_i$, $i = 1, 2$. Show that $\|\cdot\|$ is
a seminorm on $V$. Furthermore, the subspace $0 \oplus U_2$ is the set
where $\|\cdot\|$ vanishes.

13. Show that for any $A \in \mathbb{C}^{m \times n}$, $\|A\|_{2,2} = \sigma_1(A) = \max_{\|x\|_2 = 1} \|Ax\|_2$.
(Hint: Observe that $\|Ax\|_2^2 = x^*(A^*A)x$.)

14. For $\mathbb{F} = \mathbb{R}, \mathbb{C}$, identify $(\mathbb{F}^{m \times n})^*$ with $\mathbb{F}^{m \times n}$ by letting $\phi_A : \mathbb{F}^{m \times n} \to \mathbb{F}$
be $\phi_A(X) = \mathrm{tr}(A^T X)$ for any $A \in \mathbb{F}^{m \times n}$. Show that for any $p \in [1, \infty]$ the
conjugate of the $p$-Schatten norm $\|\cdot\|_{p,S}$ is the $q$-Schatten norm on
$\mathbb{F}^{m \times n}$, where $\frac{1}{p} + \frac{1}{q} = 1$.
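The construction of Problem 3, which recovers a norm from its unit ball, can be approximated numerically when the set $X$ is available only through a membership test. The sketch below is a minimal illustration under assumptions: the function name `gauge`, the bisection bounds, and the two sample sets in $\mathbb{R}^2$ are hypothetical, and the bisection relies on $X$ being convex with $0$ in its interior, so that $\frac{1}{r} x \in X$ for every $r \ge f(x)$.

```python
def gauge(x, contains, hi=1e6, iterations=200):
    """Approximate f(x) = min{r > 0 : x / r in X} by bisection.

    `contains` is a membership test for X. The initial upper bound `hi`
    is assumed to be large enough that x / hi lies in X.
    """
    if all(t == 0 for t in x):
        return 0.0
    lo = 0.0
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if contains([t / mid for t in x]):
            hi = mid   # x / mid is in X, so f(x) <= mid
        else:
            lo = mid   # x / mid is outside X, so f(x) > mid
    return hi


def in_diamond(v):
    """Unit ball of the l_1 norm."""
    return sum(abs(t) for t in v) <= 1.0


def in_disc(v):
    """Unit ball of the l_2 norm."""
    return sum(t * t for t in v) <= 1.0


x = [3.0, -4.0]
print(gauge(x, in_diamond), gauge(x, in_disc))
```

The two values agree with $\|x\|_1 = 7$ and $\|x\|_2 = 5$ up to rounding, as expected when $X$ is the unit ball of a known norm.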

7.5 Numerical ranges and radii 

Let $\mathrm{S}^{2n-1} := \{x \in \mathbb{C}^n, x^* x = 1\}$ be the unit sphere of the $\ell_2$ norm on $\mathbb{C}^n$.

Definition 7.5.1 A map $\phi$ from $\mathrm{S}^{2n-1}$ to $2^{\mathbb{C}^n}$, the set of all subsets of
$\mathbb{C}^n$, is called a $\nu$-map, if the following conditions hold.

1. For each $x \in \mathrm{S}^{2n-1}$ the set $\phi(x)$ is a nonempty compact set.

2. The set $\bigcup_{x \in \mathrm{S}^{2n-1}} \phi(x)$ is compact.

3. Let $x_k \in \mathrm{S}^{2n-1}$, $y_k \in \phi(x_k)$ for $k \in \mathbb{N}$. Assume that $\lim_{k \to \infty} x_k = x$
and $\lim_{k \to \infty} y_k = y$. (Note that $x \in \mathrm{S}^{2n-1}$.) Then $y \in \phi(x)$.

4. $y^T x = 1$ for each $x \in \mathrm{S}^{2n-1}$ and $y \in \phi(x)$.

Assume that $\phi$ from $\mathrm{S}^{2n-1}$ to $2^{\mathbb{C}^n}$ is a $\nu$-map. Then for $A \in \mathbb{C}^{n \times n}$

(7.5.1)   $\omega_\phi(A) := \bigcup_{x \in \mathrm{S}^{2n-1}} \bigcup_{y \in \phi(x)} \{y^T A x\}$,

(7.5.2)   $r_\phi(A) := \max_{x \in \mathrm{S}^{2n-1},\ y \in \phi(x)} |y^T A x|$

are called the $\phi$-numerical range and the $\phi$-numerical radius respectively.
It is straightforward to show that $r_\phi$ is a seminorm on $\mathbb{C}^{n \times n}$, see Problem 1.

Lemma 7.5.2 Let $\phi : \mathrm{S}^{2n-1} \to 2^{\mathbb{C}^n}$ be a $\nu$-map. Then $\mathrm{spec}(A)$, the
spectrum of $A$, is contained in the $\phi$-numerical range of $A$. In particular
$r_\phi(A) \ge \rho(A)$.

Proof. Let $A \in \mathbb{C}^{n \times n}$ and assume that $\lambda$ is an eigenvalue of $A$.
Then there exists an eigenvector $x \in \mathrm{S}^{2n-1}$ such that $Ax = \lambda x$. Choose
$y \in \phi(x)$. Then $y^T A x = \lambda y^T x = \lambda$. Hence $\lambda \in \omega_\phi(A)$. Thus $r_\phi(A) \ge |\lambda|$. □



Lemma 7.5.3 Let $\|\cdot\| : \mathbb{C}^{n \times n} \to \mathbb{R}_+$ be a seminorm, which is spectral
dominant, i.e. $\|A\| \ge \rho(A)$ for any $A \in \mathbb{C}^{n \times n}$. Then $\|\cdot\|$ is a norm on
$\mathbb{C}^{n \times n}$.

Proof. Assume to the contrary that $\|\cdot\|$ is not a norm. Hence there
exists $0 \ne A \in \mathbb{C}^{n \times n}$ such that $\|A\| = 0$. Since $0 = \|A\| \ge \rho(A)$ we
deduce that $A$ is a nonzero nilpotent matrix. Hence, $T^{-1} A T = \oplus_{i=1}^k J_i$,
where each $J_i$ is a nilpotent Jordan block and $T \in \mathrm{GL}(n, \mathbb{C})$. Since $A \ne 0$
we may assume that $J_1 \in \mathbb{C}^{l \times l}$ has an upper diagonal equal to 1, all other
entries equal to 0 and $l \ge 2$. Let $B = \oplus_{i=1}^k B_i$ where each $B_i$ has the same
dimensions as $J_i$. Assume that $B_i$ are zero matrices for $i > 1$, if $k > 1$. Let
$B_1 = [b_{ij,1}] \in \mathbb{C}^{l \times l}$, where $b_{21,1} = 1$ and all other entries of $B_1$ equal 0. It
is straightforward to show that the matrix $\oplus_{i=1}^k (J_i + tB_i)$ has two nonzero
eigenvalues $\pm\sqrt{t}$ for $t > 0$. Let $C := T(\oplus_{i=1}^k B_i)T^{-1}$. Then $\rho(A + tC) = \sqrt{t}$
for $t > 0$. Hence for $t > 0$ we obtain the inequalities

$\rho(A + tC) = \sqrt{t} \le \|A + tC\| \le \|A\| + \|tC\| = t\|C\| \Rightarrow \|C\| \ge \frac{1}{\sqrt{t}}$.

The above inequality cannot hold for an arbitrary small positive $t$. This
contradiction implies the lemma. □

Use the above Lemmas and Problem 1 to deduce.

Theorem 7.5.4 Let $\phi : \mathrm{S}^{2n-1} \to 2^{\mathbb{C}^n}$ be a $\nu$-map. Then $r_\phi(\cdot)$ is a
spectral dominant norm on $\mathbb{C}^{n \times n}$.

We now consider a few examples of $\nu$-maps.

Example 7.5.5 The function $\phi_2 : \mathrm{S}^{2n-1} \to 2^{\mathbb{C}^n}$ given by $\phi_2(x) := \{\bar{x}\}$
is a $\nu$-map. The corresponding numerical range and numerical radius of
$A \in \mathbb{C}^{n \times n}$ are given by

$\omega_2(A) = \{z = x^* A x$, for all $x \in \mathbb{C}^n$ satisfying $x^* x = 1\} \subset \mathbb{C}$,

$r_2(A) := \max_{x \in \mathbb{C}^n,\ x^* x = 1} |x^* A x|$.

These are called the classical numerical range and numerical radius of $A$, or
simply the numerical range and numerical radius of $A$.
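For a $2 \times 2$ matrix the classical numerical radius can be estimated by sampling unit vectors. The sketch below is a crude illustration, not an efficient algorithm: it uses the fact that every unit vector in $\mathbb{C}^2$ is a unimodular multiple of $(\cos\theta, e^{i\varphi}\sin\theta)^T$ with $\theta \in [0, \pi/2]$, and that $x^* A x$ does not change under such multiples. The function name and the sampling density `steps` are hypothetical, and the result is a lower estimate of $r_2(A)$.

```python
import cmath
import math


def numerical_radius_2x2(a, steps=180):
    """Lower estimate of r_2(A) = max |x* A x| over sampled unit vectors x in C^2."""
    best = 0.0
    for i in range(steps + 1):
        theta = (math.pi / 2.0) * i / steps
        for j in range(steps):
            phase = cmath.exp(2j * math.pi * j / steps)
            x0 = complex(math.cos(theta), 0.0)
            x1 = phase * math.sin(theta)
            # Components of A x.
            ax0 = a[0][0] * x0 + a[0][1] * x1
            ax1 = a[1][0] * x0 + a[1][1] * x1
            # x* A x, with the conjugate taken on the left factor.
            value = x0.conjugate() * ax0 + x1.conjugate() * ax1
            best = max(best, abs(value))
    return best


nilpotent = [[0, 1], [0, 0]]   # Jordan block J_2(0): spectral radius 0
normal = [[1, 0], [0, 1j]]     # normal matrix with eigenvalues 1 and i

print(numerical_radius_2x2(nilpotent), numerical_radius_2x2(normal))
```

The nilpotent block has spectral radius 0 but numerical radius $\frac{1}{2}$, so $r_2(A) \ge \rho(A)$ can be strict, while for the normal matrix the two quantities coincide.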

More generally:

Example 7.5.6 For $p \in (1, \infty)$ the function $\phi_p : \mathrm{S}^{2n-1} \to 2^{\mathbb{C}^n}$ given
by $\phi_p((x_1, \ldots, x_n)^T) := \{\|x\|_p^{-p} (|x_1|^{p-2} \bar{x}_1, \ldots, |x_n|^{p-2} \bar{x}_n)^T\}$ is a $\nu$-map.
The corresponding numerical range and numerical radius of $A \in \mathbb{C}^{n \times n}$ are
denoted by $\omega_p(A)$ and $r_p(A)$ respectively.

The most general example related to a norm on $\mathbb{C}^n$ is as follows.

Example 7.5.7 Let $\|\cdot\|$ be a norm on $\mathbb{C}^n$. For each $x \in \mathrm{S}^{2n-1}$ let
$\phi_{\|\cdot\|}(x)$ be the set of all $y \in \mathbb{C}^n$ with the dual norm $\|y\|^* = \frac{1}{\|x\|}$ satisfying
$y^T x = 1$. Then $\phi_{\|\cdot\|}$ is a $\nu$-map. (See Problem 6.) The corresponding
numerical range $\omega_{\|\cdot\|}(A)$ and the numerical radius $r_{\|\cdot\|}(A)$ are called the Bauer
numerical range and the Bauer numerical radius respectively of $A \in \mathbb{C}^{n \times n}$.

Definition 7.5.8 A norm $\|\cdot\|$ on $\mathbb{C}^{n \times n}$ is called stable if there exists
$K > 0$ such that $\|A^m\| \le K\|A\|^m$ for all $A \in \mathbb{C}^{n \times n}$.

Clearly, $\|\cdot\|$ is stable if and only if the unit ball $\mathrm{B}_{\|\cdot\|} \subset \mathbb{C}^{n \times n}$ is power
bounded, see Definition 3.4.5.

Theorem 7.5.9 Let $\phi : \mathrm{S}^{2n-1} \to 2^{\mathbb{C}^n}$ be a $\nu$-map. Set $c := \max_{U \in \mathrm{U}(n)} r_\phi(U)$.
Then

(7.5.3)   $\|(zI - A)^{-1}\|_2 \le \frac{c}{|z| - 1}$ for all $|z| > 1$, $r_\phi(A) \le 1$.

In particular, a $\phi$-numerical radius is a stable norm.

Proof. Fix $x \in \mathrm{S}^{2n-1}$. We first note that $\|y\|_2 \le c$ for each $y \in \phi(x)$.
Let $z = \frac{1}{\|y\|_2} \bar{y} \in \mathrm{S}^{2n-1}$. Then there exists $U \in \mathrm{U}(n)$ such that $Ux = z$.
Hence $\|y\|_2 = y^T U x \le r_\phi(U) \le c$. Assume next that $r_\phi(A) \le 1$.
Hence $\rho(A) \le r_\phi(A) \le 1$. So $(zI - A)^{-1}$ is defined for $|z| > 1$. Let
$v := (zI - A)^{-1} x$, $v_1 := \frac{1}{\|v\|_2} v$. Then for $y \in \phi(v_1)$ we have

$\frac{|y^T x|}{\|v\|_2} = |y^T (zI - A) v_1| = |z - y^T A v_1| \ge |z| - 1$.

On the other hand

$|y^T x| \le \|y\|_2 \|x\|_2 = \|y\|_2 \le c$.

Combine the above inequalities to deduce $\|(zI - A)^{-1} x\|_2 \le \frac{c}{|z| - 1}$ for all
$\|x\|_2 = 1$. Use Problem 7.4.13 to deduce (7.5.3). Theorem 3.4.9 yields that
the unit ball corresponding to the norm $r_\phi(\cdot)$ is a power bounded set, i.e.
the norm $r_\phi(\cdot)$ is stable. □



Theorem 7.5.10 Let $\|\cdot\|$ be a norm on $\mathbb{C}^{n \times n}$. Then $\|\cdot\|$ is stable if
and only if it is spectral dominant.

Proof. Assume first that $\|\cdot\|$ is stable. So $\mathrm{B}_{\|\cdot\|}$ is a power bounded
set. Theorem 3.3.2 yields that each $A \in \mathrm{B}_{\|\cdot\|}$ satisfies $\rho(A) \le 1$. So if
$A \ne 0$ we get that $\rho(\frac{1}{\|A\|} A) \le 1$, i.e. $\rho(A) \le \|A\|$ for any $A \ne 0$. Clearly
$\rho(0) = \|0\| = 0$. Hence a stable norm is spectral dominant.

Assume now that $\|\cdot\|$ is a spectral dominant norm on $\mathbb{C}^{n \times n}$. Recall that
$\mathrm{B}_{\|\cdot\|}$ is a convex compact balanced set, and $0$ is an interior point. Define a
new set

$\mathcal{A} := \{B \in \mathbb{C}^{n \times n}, B = (1 - a)A + zI, a \in [0, 1], z \in \mathbb{C}, |z| \le a, A \in \mathrm{B}_{\|\cdot\|}\}$.

It is straightforward to show that $\mathcal{A}$ is a convex compact balanced set. Note
that by choosing $a = 1$ we deduce that $I \in \mathcal{A}$. Furthermore, by choosing
$a = 0$ we deduce that $\mathrm{B}_{\|\cdot\|} \subset \mathcal{A}$. So $0$ is an interior point of $\mathcal{A}$. Problem 7.4.3
yields that there exists a norm $|||\cdot|||$ on $\mathbb{C}^{n \times n}$ such that $\mathrm{B}_{|||\cdot|||} = \mathcal{A}$. Since
$\mathrm{B}_{\|\cdot\|} \subset \mathrm{B}_{|||\cdot|||}$ it follows that $|||A||| \le \|A\|$ for each $A \in \mathbb{C}^{n \times n}$. We claim that $|||\cdot|||$
is spectral dominant. Assume that $|||B||| = 1$. So $B = (1 - a)A + zI$ for
some $a \in [0, 1]$, $z \in \mathbb{C}$, $|z| \le a$ and $A \in \mathrm{B}_{\|\cdot\|}$. Since $\|\cdot\|$ is spectral dominant
it follows that $\rho(A) \le \|A\| \le 1$. Note that $\mathrm{spec}(B) = (1 - a)\mathrm{spec}(A) + z$.
Hence $\rho(B) \le (1 - a)\rho(A) + |z| \le (1 - a) + a = 1$. So $|||\cdot|||$ is spectral
dominant. Since $|||I||| \le 1$ and $|||I||| \ge \rho(I) = 1$ we deduce that $|||I||| = 1$.
Hence, for any $z \in \mathbb{C}$, $|z| \le 1$ we have $|||zI||| \le 1$.

For $x \in \mathrm{S}^{2n-1}$ let

$C(x) = \{u \in \mathbb{C}^n, u = Bx, |||B||| < 1\}$.

Clearly $C(x)$ is a convex set in $\mathbb{C}^n$. Since for each $|||B||| < 1$ we have that
$\rho(B) < 1$ it follows that $x \notin C(x)$. The hyperplane separation theorem,
Theorem 7.1.6, implies the existence of $y \in \mathbb{C}^n$ such that

(7.5.4)   $\Re(y^T x) > \Re(y^T B x)$ for all $|||B||| < 1$.

Substituting $B = zI$, $|z| < 1$, in the above inequality we deduce that $\Re(y^T x) > \Re(z y^T x)$.
By choosing an appropriate argument of $z$ we deduce $\Re(y^T x) > |z| |y^T x|$.
Hence $\Re(y^T x) \ge |y^T x|$. In view of the strict inequality in (7.5.4)
we deduce that $y^T x$ is real and positive. Thus we can renormalize $y$ so
that $y^T x = 1$. Let $\phi(x)$ be the set of all $w \in \mathbb{C}^n$ such that

$w^T x = 1$,   $\max_{|||B||| \le 1} |w^T B x| = 1$.

Clearly, $y \in \phi(x)$. It is straightforward to show that $\phi : \mathrm{S}^{2n-1} \to 2^{\mathbb{C}^n}$ is a
$\nu$-map.

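As $w^T B x = \mathrm{tr}\, B(x w^T)$ we deduce that $|||x w^T|||^* = 1$, where $|||\cdot|||^*$ is
the dual norm of $|||\cdot|||$ on $\mathbb{C}^{n \times n}$:

(7.5.5)   $|||C|||^* = \max_{B \in \mathrm{B}_{|||\cdot|||}} |\mathrm{tr}\, BC| = \max_{B \in \mathrm{S}_{|||\cdot|||}} |\mathrm{tr}\, BC|$.

Let $\mathcal{R}(1, n, n) \subset \mathbb{C}^{n \times n}$ be the variety of all matrices of rank one at most.
Clearly, $\mathcal{R}(1, n, n)$ is a closed set consisting of all matrices of rank one and
$0_{n \times n}$. Hence $\mathcal{R}(1, n, n) \cap \mathrm{S}_{|||\cdot|||^*}$ is a compact set consisting of all $x w^T$,
where $x \in \mathrm{S}^{2n-1}$ and $w \in \phi(x)$. Since $(|||\cdot|||^*)^* = |||\cdot|||$ it follows that

$r_\phi(B) = \max_{x \in \mathrm{S}^{2n-1},\ w \in \phi(x)} |w^T B x| = \max_{x \in \mathrm{S}^{2n-1},\ w \in \phi(x)} |\mathrm{tr}\, B(x w^T)| \le \max_{C \in \mathrm{B}_{|||\cdot|||^*}} |\mathrm{tr}\, BC| = |||B||| \le \|B\|$.

Hence $\mathrm{B}_{\|\cdot\|} \subset \mathrm{B}_{|||\cdot|||} \subset \mathrm{B}_{r_\phi(\cdot)}$. Theorem 7.5.9 yields that $r_\phi(\cdot)$ is a stable
norm. Hence $\|\cdot\|$ and $|||\cdot|||$ are stable norms. □

Use Theorem 7.5.10 and Problem 3 to deduce.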

Corollary 7.5.11 Let $\mathcal{A} \subset \mathbb{C}^{n \times n}$ be a compact, convex, balanced set,
whose interior contains $0$. Then $\mathcal{A}$ is stable if and only if $\rho(A) \le 1$ for each
$A \in \mathcal{A}$.

Definition 7.5.12 Let $\mathbb{F}$ be a field. A subspace $0 \ne \mathbf{U} \subset \mathbb{F}^{n \times n}$ is called
stable if there exists an integer $k \in [1, n]$ such that the dimension of the
subspace $\mathbf{U}x \subset \mathbb{F}^n$ is $k$ for any $0 \ne x \in \mathbb{F}^n$. $\mathbf{U}$ is called maximally stable
if $k = n$.

The following result is a generalization of Theorem 7.5.10.

Theorem 7.5.13 Let $\mathcal{A} \subset \mathbb{C}^{n \times n}$ be a compact convex balanced set (see
Problem 7.4.2 for the definition of a balanced set), which contains
the identity matrix. Assume that $\mathcal{L} := \mathrm{span}\, \mathcal{A}$ is a stable subspace. Then
$\mathcal{A}$ is a stable set if and only if $\rho(A) \le 1$ for each $A \in \mathcal{A}$.



Proof. Clearly, if $\mathcal{A}$ is stable, then each $A \in \mathcal{A}$ is power bounded,
hence $\rho(A) \le 1$. Assume now that $\mathcal{A}$ is a compact convex balanced set
containing the identity such that $\mathcal{L}$ is a stable subspace. Let $x \in \mathrm{S}^{2n-1}$ and
consider the subspace $\mathcal{L}x$ of dimension $k$. Since $\mathcal{A}$ is a compact convex
balanced set it follows that $\mathcal{A}x$ is a compact convex balanced set in $\mathcal{L}x$. Since
$\mathrm{span}\, \mathcal{A}x = \mathcal{L}x$ it follows that $0 \in \mathrm{ri}\, \mathcal{A}x$. Hence $\mathcal{A}x$ is a unit ball of the norm
$\|\cdot\|_x$ on the subspace $\mathcal{L}x$. Since $I \in \mathcal{A}$ it follows that $x \in \mathcal{A}x$. We claim
that $\|x\|_x = 1$. Assume to the contrary that $\|x\|_x < 1$. Then $(1 + \varepsilon)x \in \mathcal{A}x$
for some $\varepsilon > 0$. So there exists $A \in \mathcal{A}$ such that $Ax = (1 + \varepsilon)x$. Hence
$\rho(A) \ge 1 + \varepsilon$, contrary to our assumptions.

Identify $(\mathcal{L}x)^*$ with $\mathcal{L}x$. So a linear functional $f : \mathcal{L}x \to \mathbb{C}$ is given by
$f(y) = z^T y$ for some $z \in \mathcal{L}x$. Let $\|\cdot\|_x^*$ be the conjugate norm on $\mathcal{L}x$.
Denote by $\mathcal{B}(x) \subset \mathcal{L}x$ the unit ball of the norm $\|\cdot\|_x^*$. Since $\|x\|_x = 1$ it
follows that there exists $z(x) \in \mathcal{L}x$ such that $z(x)^T x = 1$ and $\|z(x)\|_x^* = 1$.
We claim that $\bigcup_{x \in \mathrm{S}^{2n-1}} \mathcal{B}(x)$ is a compact set in $\mathbb{C}^n$.

Indeed, since $\mathcal{L}x$ has a fixed dimension $k$ for each $x \in \mathrm{S}^{2n-1}$ we
can view $\mathcal{L}x$ as being of the form $U(x)W$ for some fixed subspace $W \subset \mathbb{C}^n$ of
dimension $k$ and a unitary matrix $U(x)$. ($U(x)$ maps an orthonormal basis
of $W$ to an orthonormal basis of $\mathcal{L}x$.) Hence the set $\mathcal{C}(x) := U^*(x)\mathcal{A}x$ is
a compact convex balanced set in $W$, with $0$ an interior point. Since $\mathcal{A}$ is
compact, it follows that $\mathcal{C}(x)$ varies continuously on $x \in \mathrm{S}^{2n-1}$. Therefore
the set $\mathcal{D}(x) := U^*(x)\mathcal{B}(x) \subset W$ varies continuously with $x \in \mathrm{S}^{2n-1}$. Hence
$\bigcup_{x \in \mathrm{S}^{2n-1}} \mathcal{D}(x)$ is a compact set in $W$, which yields that $\bigcup_{x \in \mathrm{S}^{2n-1}} \mathcal{B}(x)$ is
a compact set in $\mathbb{C}^n$. In particular, there exists a constant $K$ such that
$\|z(x)\|_2 \le K$. We now claim that $\mathcal{A}$ satisfies the condition 3.4.13 of
Theorem 3.4.9. Indeed, for $x \in \mathrm{S}^{2n-1}$, $A \in \mathcal{A}$ we have that $Ax \in \mathcal{A}x$, hence
$\|Ax\|_x \le 1$. Hence for $|\lambda| > 1$ we have



$\|(\lambda I - A)x\|_2 \ge \frac{|z(x)^T(\lambda I - A)x|}{K} = \frac{|\lambda - z(x)^T A x|}{K} \ge \frac{|\lambda| - |z(x)^T A x|}{K}$

$\ge \frac{|\lambda| - \|Ax\|_x \|z(x)\|_x^*}{K} \ge \frac{|\lambda| - 1}{K}$.

Thus for each $|\lambda| > 1$ and $0 \ne x \in \mathbb{C}^n$ we have the inequality
$\|(\lambda I - A)x\|_2 \ge \frac{|\lambda| - 1}{K} \|x\|_2$. Choose $x = (\lambda I - A)^{-1} y$ to deduce the inequality

$\frac{\|(\lambda I - A)^{-1} y\|_2}{\|y\|_2} \le \frac{K}{|\lambda| - 1}$.

So $\sigma_1((\lambda I - A)^{-1}) \le \frac{K}{|\lambda| - 1}$. Hence $\mathcal{A}$ satisfies the condition 3.4.13 of
Theorem 3.4.9 with the norm $\sigma_1(\cdot)$. Theorem 3.4.9 yields that $\mathcal{A}$ is stable. □

Problem 7 shows that in Theorem 7.5.13 the assumption that $\mathcal{L}$ is a stable subspace cannot be dropped. In Chapter ? [Fri84] we show the following result.

Theorem 7.5.14 Let $n \ge 2$, $d \in [2n-1, n^2]$ be integers. Then a generic subspace $\mathcal{L}$ of $\mathbb{C}^{n\times n}$ of dimension $d$ is maximally stable.



Problems 

1. Let $\phi : S^{2n-1} \to 2^{\mathbb{C}^n}$ be a $\nu$-map. Show that $r_\phi : \mathbb{C}^{n\times n} \to \mathbb{R}_+$ is a seminorm.

2. Let $\phi : S^{2n-1} \to 2^{\mathbb{C}^n}$ be a $\nu$-map. Show that $r_\phi(I_n) = 1$.

3. Show that for any $p \in (1, \infty)$ the map $\phi_p$ given in Example 7.5.6 is a $\nu$-map.

4. Show

(a) For any unitary $U \in \mathbb{C}^{n\times n}$ and $A \in \mathbb{C}^{n\times n}$, $\omega_2(U^*AU) = \omega_2(A)$ and $r_2(U^*AU) = r_2(A)$.

(b) For a normal $A \in \mathbb{C}^{n\times n}$ the numerical range $\omega_2(A)$ is the convex hull of the eigenvalues of $A$. In particular $r_2(A) = \rho(A)$ for a normal $A$.

(c) $\omega_2(A)$ is a convex set for any $A \in \mathbb{C}^{n\times n}$. (Observe that it is enough to prove this claim only for $n = 2$.)

5. Let $A = J_4(0) \in \mathbb{C}^{4\times 4}$ be a nilpotent Jordan block of order 4. Show that $r_2(A) < 1$, $r_2(A^2) = \frac{1}{2}$, $r_2(A^3) = \frac{1}{2}$. Hence the inequality $r_2(A^3) \le r_2(A)\,r_2(A^2)$ does not hold in general.

6. Show that the map $\phi_{\|\cdot\|} : S^{2n-1} \to 2^{\mathbb{C}^n}$ given in Example 7.5.7 is a $\nu$-map.

7. Let $\|\cdot\|$ be the norm on $\mathbb{C}^{n\times n}$ given by $\|[a_{ij}]_{i=j=1}^{n}\| := \max_{i,j\in[1,n]}|a_{ij}|$. Denote by $\mathcal{U}_n \subset \mathbb{C}^{n\times n}$ the subspace of upper triangular matrices. For $n \ge 2$ show

(a) $\mathcal{U}_n$ is not a stable subspace of $\mathbb{C}^{n\times n}$.

(b) $\mathcal{U}_n \cap B_{\|\cdot\|}$ is a compact, convex, balanced set.

(c) $\rho(A) \le 1$ for each $A \in \mathcal{U}_n \cap B_{\|\cdot\|}$.

(d) $\mathcal{U}_n \cap B_{\|\cdot\|}$ is not a stable set.

7.6 Superstable norms 

Definition 7.6.1 A norm $\|\cdot\|$ on $\mathbb{C}^{n\times n}$ is called superstable if $\|A^k\| \le \|A\|^k$ for $k = 2, \ldots,$ and each $A \in \mathbb{C}^{n\times n}$.

Clearly, any operator norm on $\mathbb{C}^{n\times n}$ is a superstable norm, see §7.7.

Theorem 7.6.2 The standard numerical radius $r_2(A) = \max_{x\in\mathbb{C}^n,\ \|x\|_2 = 1}|x^*Ax|$ is a superstable norm.

To prove the theorem we need the following lemma. 

Lemma 7.6.3 Assume that $A \in \mathbb{C}^{n\times n}$, $\rho(A) < 1$ and $x \in S^{2n-1}$. Let $z_j = e^{2\pi j\sqrt{-1}/m}$ and assume that $I_n - z_jA \in \mathrm{GL}(n, \mathbb{C})$ for $j = 1, \ldots, m$. Then

(7.6.1)   $1 - x^*A^mx = \frac{1}{m}\sum_{j=1}^{m}\|x_j\|_2^2\,(1 - z_j\,y_j^*Ay_j),$

where $x_j = \Big(\prod_{k\in\langle m\rangle\setminus\{j\}}(I_n - z_kA)\Big)x$, $y_j = \frac{1}{\|x_j\|_2}x_j$, $j = 1, \ldots, m$.

Proof. Observe the following two polynomial identities in the variable $z$:

$$1 - z^m = \prod_{k=1}^{m}(1 - z_kz), \qquad 1 = \frac{1}{m}\sum_{j=1}^{m}\ \prod_{k\in\langle m\rangle\setminus\{j\}}(1 - z_kz).$$

Replace the variable $z$ by $A$ to obtain the identities

(7.6.2)   $I_n - A^m = \prod_{k=1}^{m}(I_n - z_kA), \qquad I_n = \frac{1}{m}\sum_{j=1}^{m}\ \prod_{k\in\langle m\rangle\setminus\{j\}}(I_n - z_kA).$

Multiply the second identity from the right by $x$ to get the identity $x = \frac{1}{m}\sum_{j=1}^{m}x_j$. Multiply the first identity by $x$ from the right to obtain $x - A^mx = \big(\prod_{k=1}^{m}(I_n - z_kA)\big)x$. Since the matrices $I_n - z_kA$ for $k = 1, \ldots, m$ commute, we deduce that for each $k$, $x - A^mx = (I_n - z_kA)x_k$. Hence

$$1 - x^*A^mx = x^*(x - A^mx) = \frac{1}{m}\sum_{j=1}^{m}x_j^*(x - A^mx) = \frac{1}{m}\sum_{j=1}^{m}x_j^*(I_n - z_jA)x_j = \frac{1}{m}\sum_{j=1}^{m}\|x_j\|_2^2\,(1 - z_j\,y_j^*Ay_j).$$

□
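The two scalar identities behind (7.6.2) are easy to test numerically. The following Python sketch is an illustration only, not part of the original argument; it assumes that the $z_j$ are the $m$-th roots of unity, and the function names are hypothetical.

```python
import cmath

def roots_of_unity(m):
    # Assumed form of the z_j: z_j = exp(2*pi*sqrt(-1)*j/m), j = 1..m.
    return [cmath.exp(2j * cmath.pi * j / m) for j in range(1, m + 1)]

def product_identity_gap(z, m):
    # |(1 - z^m) - prod_k (1 - z_k z)|, which should vanish up to rounding.
    prod = 1 + 0j
    for zk in roots_of_unity(m):
        prod *= 1 - zk * z
    return abs((1 - z ** m) - prod)

def partition_identity_gap(z, m):
    # |1 - (1/m) sum_j prod_{k != j} (1 - z_k z)|, which should also vanish.
    zs = roots_of_unity(m)
    total = 0j
    for j in range(m):
        term = 1 + 0j
        for k in range(m):
            if k != j:
                term *= 1 - zs[k] * z
        total += term
    return abs(1 - total / m)
```

Both gaps should be of the order of floating point rounding for any complex $z$; substituting a matrix $A$ for $z$ is what gives (7.6.2).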



Proof of Theorem 7.6.2. From the proof of Lemma 7.6.3 it follows that (7.6.1) holds for any $A \in \mathbb{C}^{n\times n}$ and some $y_1, \ldots, y_m \in S^{2n-1}$, since for $x_j = 0$ in (7.6.1) we can choose any $y_j \in S^{2n-1}$. Suppose that $r_2(A) = 1$. Let $\zeta \in \mathbb{C}$, $|\zeta| = 1$. Apply the equality (7.6.1) to $\zeta A$ and $x \in S^{2n-1}$ to deduce

$$1 - \zeta^m x^*A^mx = \frac{1}{m}\sum_{j=1}^{m}\|u_j\|_2^2\,(1 - z_j\zeta\,w_j^*Aw_j)$$

for corresponding $w_1, \ldots, w_m \in S^{2n-1}$, where $u_j$ is the vector $x_j$ of Lemma 7.6.3 formed with $\zeta A$ in place of $A$. Choose $\zeta$ such that $\zeta^m x^*A^mx = |x^*A^mx|$. Since $r_2(z_j\zeta A) = r_2(A) = 1$, it follows that $\Re(1 - z_j\zeta\,w_j^*Aw_j) \ge 0$. Hence, the above displayed equality yields $1 - |x^*A^mx| \ge 0$, i.e. $|x^*A^mx| \le 1$. Since $x \in S^{2n-1}$ is arbitrary, it follows that $r_2(A^m) \le 1$ if $r_2(A) = 1$. Hence $r_2(\cdot)$ is a superstable norm. □

Definition 7.6.4 For an integer $n \ge 2$ and $p \in [1, \infty]$ let $K_{p,n} \ge 1$ be the smallest constant satisfying $r_p(A^m) \le K_{p,n}\,r_p(A)^m$ for all $A \in \mathbb{C}^{n\times n}$.

Theorem 7.6.2 is equivalent to the equality $K_{2,n} = 1$. Problem 1b shows that $K_{1,n} = K_{\infty,n} = 1$. It is an open problem if $\sup_{n\in\mathbb{N}}\max_{p\in[1,\infty]}K_{p,n} < \infty$.

Theorem 7.6.5 Let $\|\cdot\|$ be a norm on $\mathbb{C}^{n\times n}$ which is invariant under the similarity by unitary matrices, i.e. $\|UAU^{-1}\| = \|A\|$ for each $A \in \mathbb{C}^{n\times n}$ and $U \in \mathrm{U}(n)$. Then $\|\cdot\|$ is spectral dominant if and only if $\|A\| \ge r_2(A)$ for any $A \in \mathbb{C}^{n\times n}$.

To prove the theorem we bring the following two lemmas which are of independent interest.

Lemma 7.6.6 Let $\|\cdot\|$ be a norm on $\mathbb{C}^{n\times n}$. Assume that $\|\cdot\|$ is invariant under the similarity by $U \in \mathrm{GL}(n, \mathbb{C})$, i.e. $\|A\| = \|UAU^{-1}\|$ for each $A \in \mathbb{C}^{n\times n}$. Then $U$ is similar to a diagonal matrix $\Lambda$:

(7.6.3)   $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n), \quad |\lambda_1| = \ldots = |\lambda_n| > 0.$

Proof. Let $\lambda, \mu \in \mathbb{C}$ be two distinct eigenvalues of $U$. So there are two corresponding nonzero vectors $x, y \in \mathbb{C}^n$ such that $Ux = \lambda x$, $U^\top y = \mu y$. For $A = xy^\top$ we deduce that $UAU^{-1} = \frac{\lambda}{\mu}A$. Since $\|A\| = \|UAU^{-1}\| > 0$ it follows that $|\lambda| = |\mu|$. Hence all the eigenvalues of $U$ have the same modulus.

It is left to show that $U$ is diagonable. Assume to the contrary that $U$ is not diagonable. Then there exists an invertible matrix $T$ and an upper triangular matrix $V = [v_{ij}]_{i=j=1}^{n}$ such that $v_{11} = v_{22} = \lambda \ne 0$, $v_{12} = 1$, and $U = TVT^{-1}$. Choose $A = TBT^{-1}$, where $B = [b_{ij}]_{i=j=1}^{n}$ has $b_{21} = -\lambda^2$ and all other entries equal to zero. Since $\|U^kAU^{-k}\| = \|A\|$ for $k \in \mathbb{N}$ it follows that the sequence of matrices $U^kAU^{-k}$, $k \in \mathbb{N}$ is bounded. A straightforward calculation shows that the $(1,2)$ entry of $T^{-1}(U^kAU^{-k})T = V^kBV^{-k}$ is $k^2$. Hence the sequence $U^kAU^{-k}$, $k \in \mathbb{N}$ is not bounded, contrary to our previous claim. The above contradiction establishes the lemma. □



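Lemma 7.6.7 Let $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n) \in \mathbb{C}^{n\times n}$ and assume that $|\lambda_1| = \ldots = |\lambda_n| > 0$ and $\lambda_i \ne \lambda_j$ for $i \ne j$. Suppose that $\|\cdot\|$ is a norm on $\mathbb{C}^{n\times n}$ which is invariant under the similarity by $\Lambda$. Then

(7.6.4)   $\|\mathrm{diag}(A)\| \le \|A\|.$

Proof. $\Lambda$-similarity invariance implies

$$\Big\|\frac{1}{m+1}\sum_{k=0}^{m}\Lambda^kA\Lambda^{-k}\Big\| \le \frac{1}{m+1}\sum_{k=0}^{m}\|\Lambda^kA\Lambda^{-k}\| = \|A\|.$$

For $A = [a_{ij}] \in \mathbb{C}^{n\times n}$ let

$$A_m = [a_{ij,m}] := \frac{1}{m+1}\sum_{k=0}^{m}\Lambda^kA\Lambda^{-k},$$

where $a_{ii,m} = a_{ii}$ and $a_{ij,m} = a_{ij}\,\dfrac{1 - (\lambda_i/\lambda_j)^{m+1}}{(m+1)(1 - \lambda_i/\lambda_j)}$ for $i \ne j$.
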
Hence $\lim_{m\to\infty}A_m = \mathrm{diag}(A)$. Since $\|A_m\| \le \|A\|$ we deduce the inequality (7.6.4). □

Proof of Theorem 7.6.5. Assume first that $\|A\| \ge r_2(A)$ for each $A \in \mathbb{C}^{n\times n}$. Since $r_2(A) \ge \rho(A)$, clearly $\|\cdot\|$ is spectral dominant. Assume now that $\|\cdot\|$ is invariant under similarity by any unitary matrix $U$, and $\|\cdot\|$ is spectral dominant. Since $\|\cdot\|$ is invariant under the similarity by a diagonal matrix $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$, where $|\lambda_1| = \ldots = |\lambda_n| = 1$ and $\lambda_i \ne \lambda_j$ for $i \ne j$, Lemma 7.6.7 yields (7.6.4). Let $A = [a_{ij}]$. Since $\mathrm{diag}(A) = \mathrm{diag}(a_{11}, \ldots, a_{nn})$ and $\|\cdot\|$ is spectral dominant we obtain that

$$\|A\| \ge \|\mathrm{diag}(A)\| \ge \rho(\mathrm{diag}(A)) = \max_{i\in\langle n\rangle}|a_{ii}|.$$

Let $V \in \mathrm{U}(n)$. Then the first column of $V$ is some $x \in S^{2n-1}$. Furthermore, the $(1,1)$ entry of $V^*AV$ is $x^*Ax$. Since $\|\cdot\|$ is invariant under unitary similarity we obtain $\|A\| = \|V^*AV\| \ge |x^*Ax|$. As for any $x \in S^{2n-1}$ there exists a unitary $V$ with the first column $x$ we deduce that $\|A\| \ge r_2(A)$. □
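The averaging argument of Lemma 7.6.7, which the proof above relies on, can be observed numerically. The sketch below is only an illustration under stated assumptions: the diagonal entries of $\Lambda$ are passed as a list `lams` of distinct unimodular numbers, and the helper name is hypothetical.

```python
import cmath

def averaged_conjugation(A, lams, m):
    # A_m = (1/(m+1)) * sum_{k=0}^{m} Lambda^k A Lambda^{-k}, Lambda = diag(lams).
    # Entry (i, j) of Lambda^k A Lambda^{-k} equals a_ij * (lam_i / lam_j)^k.
    n = len(A)
    out = [[0j] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            ratio = lams[i] / lams[j]
            geometric_sum = sum(ratio ** k for k in range(m + 1))
            out[i][j] = A[i][j] * geometric_sum / (m + 1)
    return out

# Three distinct points on the unit circle and an arbitrary test matrix.
lams = [cmath.exp(0j), cmath.exp(1j), cmath.exp(2j)]
A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
A_m = averaged_conjugation(A, lams, 2000)
```

The diagonal of `A_m` stays equal to that of `A`, while each off-diagonal entry is bounded by $2|a_{ij}|/((m+1)\,|1-\lambda_i/\lambda_j|)$ and so tends to zero as $m$ grows.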



Problems 

1. (a) Describe the $\nu$-maps $\phi_{\|\cdot\|_1}$, $\phi_{\|\cdot\|_\infty}$.

(b) Show that for $p = 1, \infty$, $r_p(A)$ is equal to the operator norm $\|A\|_p$, for $A \in \mathbb{C}^{n\times n}$ viewed as a linear operator $A : \mathbb{C}^n \to \mathbb{C}^n$. (See §7.7.) Hence $K_{1,n} = K_{\infty,n} = 1$, where $K_{p,n}$ is given in Definition 7.6.4.

(c) Show that for each $p \in (1, \infty)$ and integer $n \ge 2$ there exists $A \in \mathbb{C}^{n\times n}$ such that $r_p(A) < \|A\|_p$, where $\|A\|_p$ is the operator norm of $A$.



7.7 Operator norms 

Let $\mathbf{V}_a, \mathbf{V}_b$ be two finite dimensional vector spaces over $\mathbb{F} = \mathbb{R}, \mathbb{C}$. Assume that $\|\cdot\|_a, \|\cdot\|_b$ are norms on $\mathbf{V}_a, \mathbf{V}_b$ respectively. Let $T : \mathbf{V}_a \to \mathbf{V}_b$ be a linear transformation. Then

(7.7.1)   $\|T\|_{a,b} := \max_{0\ne x\in\mathbf{V}_a}\dfrac{\|Tx\|_b}{\|x\|_a}$

is called the operator norm of $T$. Clearly

(7.7.2)   $\|T\|_{a,b} = \max_{\|x\|_a\le 1}\|Tx\|_b = \max_{\|x\|_a = 1}\|Tx\|_b.$

See Problem 1. Let $\mathbf{V}_c$ be a third finite dimensional vector space over $\mathbb{F}$ with a norm $\|\cdot\|_c$. Assume that $Q : \mathbf{V}_b \to \mathbf{V}_c$ is a linear transformation. Then we have the well known inequality

(7.7.3)   $\|QT\|_{a,c} \le \|Q\|_{b,c}\,\|T\|_{a,b}.$

See Problem 2.

Assume now that $\mathbf{V}_a = \mathbf{V}_b = \mathbf{V}_c = \mathbf{V}$ and $\|\cdot\|_a = \|\cdot\|_b = \|\cdot\|_c = \|\cdot\|$. We then denote $\|T\| := \|T\|_{a,b}$ and $\|Q\| := \|Q\|_{b,c}$. Let $\mathrm{Id} : \mathbf{V} \to \mathbf{V}$ be the identity operator. Hence

(7.7.4)   $\|\mathrm{Id}\| = 1, \quad \|QT\| \le \|Q\|\,\|T\|, \quad \|T^m\| \le \|T\|^m$ for $m = 2, \ldots.$

Assume that $\mathbf{V}_a = \mathbb{F}^n$, $\mathbf{V}_b = \mathbb{F}^m$. Then $T : \mathbb{F}^n \to \mathbb{F}^m$ is represented by a matrix $A \in \mathbb{F}^{m\times n}$. Thus $\|A\|_{a,b}$ is the operator norm of $A$. For $m = n$ and $\|\cdot\|_a = \|\cdot\|_b = \|\cdot\|$ we denote by $\|A\|$ the operator norm. Assume that $s, t \in [1, \infty]$. Then for $A \in \mathbb{F}^{m\times n}$ we denote by $\|A\|_{s,t}$ the operator norm of $A$, where $\mathbb{F}^n, \mathbb{F}^m$ are equipped with the norms $\ell_s, \ell_t$ respectively. Note that $\|A\|_{2,2} = \sigma_1(A)$, see Problem 7.4.13. For $m = n$ and $s = t = p$ we denote by $\|A\|_p$ the $\ell_p$ operator norm of $A$.

Lemma 7.7.1 Let $A = [a_{ij}] \in \mathbb{F}^{m\times n}$ and let $\|\cdot\|_a, \|\cdot\|_b$ be norms on $\mathbb{F}^n, \mathbb{F}^m$ respectively. If $\|\cdot\|_b$ is an absolute norm then

(7.7.5)   $\|A\|_{a,b} \le \big\|\big(\|(a_{11}, \ldots, a_{1n})^\top\|_a^*, \ldots, \|(a_{m1}, \ldots, a_{mn})^\top\|_a^*\big)^\top\big\|_b.$

If $\|\cdot\|_a$ is an absolute norm then

(7.7.6)   $\|A\|_{a,b} \le \big\|\big(\|(a_{11}, \ldots, a_{m1})^\top\|_b, \ldots, \|(a_{1n}, \ldots, a_{mn})^\top\|_b\big)^\top\big\|_a^*.$

In both inequalities, equality holds for matrices of rank one.

Proof. Let $x \in \mathbb{F}^n$. Then

$$(Ax)_i = \sum_{j=1}^{n}a_{ij}x_j \ \Rightarrow\ |(Ax)_i| \le \|(a_{i1}, \ldots, a_{in})^\top\|_a^*\,\|x\|_a, \quad i = 1, \ldots, m \ \Rightarrow$$
$$|Ax| \le \|x\|_a\,\big(\|(a_{11}, \ldots, a_{1n})^\top\|_a^*, \ldots, \|(a_{m1}, \ldots, a_{mn})^\top\|_a^*\big)^\top.$$

Assume that $\|\cdot\|_b$ is an absolute norm. Then

$$\|Ax\|_b \le \|x\|_a\,\big\|\big(\|(a_{11}, \ldots, a_{1n})^\top\|_a^*, \ldots, \|(a_{m1}, \ldots, a_{mn})^\top\|_a^*\big)^\top\big\|_b,$$

which yields (7.7.5). Suppose that $A$ is a rank one matrix. So $A = uv^\top$, where $0 \ne u \in \mathbb{F}^m$, $0 \ne v \in \mathbb{F}^n$. There exists $0 \ne x \in \mathbb{F}^n$ such that $v^\top x = \|v\|_a^*\,\|x\|_a$. For this $x$ we have that $|Ax| = \|v\|_a^*\,\|x\|_a\,|u|$. Hence $\frac{\|Ax\|_b}{\|x\|_a} = \|v\|_a^*\,\|u\|_b$. Thus $\|A\|_{a,b} \ge \|v\|_a^*\,\|u\|_b$. On the other hand the right-hand side of (7.7.5) is $\|v\|_a^*\,\|u\|_b$. This shows that (7.7.5) is sharp for rank one matrices.

Assume now that $\|\cdot\|_a$ is an absolute norm. Theorem 7.4.3 claims that $\|\cdot\|_a^*$ is an absolute norm. Apply (7.7.5) to $\|A^\top\|_{b^*,a^*}$ and use Problem 1e to deduce the inequality (7.7.6). Assume that $A$ is a rank one matrix. Then $A^\top$ is a rank one matrix and equality holds in (7.7.6). □



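Theorem 7.7.2 Let $m, n \ge 2$ be integers. Assume that $s, t \in [1, \infty]$ and suppose $\mathbb{F}^n, \mathbb{F}^m$ are endowed with the Hölder norms $\|\cdot\|_s, \|\cdot\|_t$ respectively. Let $s^*$ be defined by the equality $\frac{1}{s} + \frac{1}{s^*} = 1$. Then for $A = [a_{ij}]_{i=j=1}^{m,n} \in \mathbb{F}^{m\times n}$ the following hold.

(7.7.7)   $\|A\|_{s,t} \le \min\Big(\Big(\sum_{i=1}^{m}\Big(\sum_{j=1}^{n}|a_{ij}|^{s^*}\Big)^{\frac{t}{s^*}}\Big)^{\frac{1}{t}},\ \Big(\sum_{j=1}^{n}\Big(\sum_{i=1}^{m}|a_{ij}|^{t}\Big)^{\frac{s^*}{t}}\Big)^{\frac{1}{s^*}}\Big),$

(7.7.8)   $\|A\|_{\infty,1} \le \sum_{i=j=1}^{m,n}|a_{ij}|,$

(7.7.9)   $\|A\|_{1,1} = \max_{1\le j\le n}\sum_{i=1}^{m}|a_{ij}|,$

(7.7.10)   $\|A\|_{\infty,\infty} = \max_{1\le i\le m}\sum_{j=1}^{n}|a_{ij}|,$

(7.7.11)   $\|A\|_{1,\infty} = \max_{1\le i\le m,\,1\le j\le n}|a_{ij}|.$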

Proof. Since $\|\cdot\|_s, \|\cdot\|_t$ are absolute norms the inequalities (7.7.5) and (7.7.6) hold. As $\|\cdot\|_s^* = \|\cdot\|_{s^*}$ we deduce (7.7.7). For $s = \infty$ we have $s^* = 1$, and for $t = 1$ (7.7.7) yields (7.7.8).

Assume that $s = t = 1$. So $s^* = \infty$. The second part of the inequality (7.7.7) yields the inequality $\|A\|_{1,1} \le \max_{1\le j\le n}\sum_{i=1}^{m}|a_{ij}|$. Let $e_j = (\delta_{j1}, \ldots, \delta_{jn})^\top$. Clearly $\|e_j\|_1 = 1$ and $\|Ae_j\|_1 = \sum_{i=1}^{m}|a_{ij}|$. Hence $\|A\|_{1,1} \ge \sum_{i=1}^{m}|a_{ij}|$. So $\|A\|_{1,1} \ge \max_{1\le j\le n}\sum_{i=1}^{m}|a_{ij}|$, which yields (7.7.9). Since $\|A\|_{\infty,\infty} = \|A^\top\|_{1,1}$, see Problem 1e, we deduce (7.7.10) from (7.7.9).

Let $s = 1$, $t = \infty$. Then (7.7.7) yields the inequality $\|A\|_{1,\infty} \le \max_{1\le i\le m,\,1\le j\le n}|a_{ij}| = |a_{i_1j_1}|$. Clearly $\|Ae_{j_1}\|_\infty = |a_{i_1j_1}|$. Hence $\|A\|_{1,\infty} \ge |a_{i_1j_1}|$, which proves (7.7.11). □

Theorem 7.7.3 Let $\mathbf{V}$ be a finite dimensional vector space over $\mathbb{C}$ with a norm $\|\cdot\|$. Let $\|\cdot\|$ be the induced operator norm on $\mathrm{Hom}(\mathbf{V}, \mathbf{V})$. Then for $A \in \mathrm{Hom}(\mathbf{V}, \mathbf{V})$ the inequality $\rho(A) \le \|A\|$ holds.

Proof. Assume that $\lambda \in \mathrm{spec}\,A$. Then there exists $0 \ne x \in \mathbf{V}$ such that $Ax = \lambda x$. So $\|A\| \ge \frac{\|Ax\|}{\|x\|} = |\lambda|$. □
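Theorem 7.7.3 can be checked on small examples. The following sketch, given purely as an illustration with hypothetical helper names, computes the spectral radius of a $2 \times 2$ matrix from its characteristic polynomial and compares it with the $\ell_1$ and $\ell_\infty$ operator norms given by (7.7.9) and (7.7.10).

```python
import cmath

def spectral_radius_2x2(A):
    # Roots of t^2 - tr(A) t + det(A) via the quadratic formula.
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c
    root = cmath.sqrt(tr * tr - 4 * det)
    return max(abs((tr + root) / 2), abs((tr - root) / 2))

def norm_1(A):
    # Operator norm induced by the l_1 norm: largest absolute column sum.
    return max(sum(abs(row[j]) for row in A) for j in range(len(A[0])))

def norm_inf(A):
    # Operator norm induced by the l_infinity norm: largest absolute row sum.
    return max(sum(abs(a) for a in row) for row in A)

examples = [
    [[0, 1], [-2, -3]],        # eigenvalues -1 and -2
    [[0, -1], [1, 0]],         # rotation, eigenvalues +i and -i
    [[2, 1j], [0, 1 - 1j]],    # upper triangular, eigenvalues 2 and 1 - i
]
for A in examples:
    rho = spectral_radius_2x2(A)
    # A small tolerance absorbs floating point rounding.
    assert rho <= norm_1(A) + 1e-12 and rho <= norm_inf(A) + 1e-12
```

The rotation matrix shows that equality $\rho(A) = \|A\|$ can occur, in line with Problems 4 and 5 below.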



Problems 

1. Show

(a) The equality (7.7.2).

(b) For each $t > 0$ there exists a vector $y \in \mathbf{V}_a$, $\|y\|_a = t$ such that $\|T\|_{a,b} = \frac{\|Ty\|_b}{t}$.

(c) $\|T\|_{a,b} = \max_{f\in S_{\|\cdot\|_b^*},\ x\in S_{\|\cdot\|_a}}|f(Tx)|$.

(d) Denote by $\|T^*\|_{b^*,a^*}$ the operator norm of $T^* : \mathbf{V}_b^* \to \mathbf{V}_a^*$ with respect to the norms $\|\cdot\|_b^*, \|\cdot\|_a^*$ respectively. Show that $\|T^*\|_{b^*,a^*} = \|T\|_{a,b}$.

(e) $\|A^\top\|_{b^*,a^*} = \|A\|_{a,b}$ for $A \in \mathbb{F}^{m\times n}$.

2. Show the inequality (7.7.3).

3. Let $T : \mathbf{V} \to \mathbf{V}$ be a linear transformation on a finite dimensional vector space $\mathbf{V}$ over $\mathbb{C}$ with a norm $\|\cdot\|$. Show that $\rho(T) \le \|T\|$.

4. Let $A \in \mathbb{C}^{n\times n}$. Then $A = Q^{-1}(\Lambda + N)Q$, where $\Lambda$ is a diagonal matrix, $N$ is strictly upper triangular and $\Lambda + N$ is the Jordan canonical form of $A$. Show

(a) $A$ is similar to $\Lambda + tN$ for any $0 \ne t \in \mathbb{C}$, i.e. $A = Q_t^{-1}(\Lambda + tN)Q_t$. Show that $Q_t = D_tQ$ for an appropriate diagonal matrix $D_t$.

(b) Let $\varepsilon > 0$ be given. Show that one can choose a norm $\|\cdot\|_t$ on $\mathbb{C}^n$ of the form $\|x\|_t := \|Q_tx\|_2$ for $|t|$ small enough such that $\|A\|_t \le \rho(A) + \varepsilon$. (Hint: Note that $\|\Lambda + tN\|_2 \le \|\Lambda\|_2 + |t|\,\|N\|_2 = \rho(A) + |t|\,\|N\|_2$.)

(c) If $N = 0$ then $\|A\| = \rho(A)$ where $\|x\| = \|Qx\|_2$.

(d) Suppose that each eigenvalue $\lambda$ of modulus $\rho(A)$ is geometrically simple. Then there exists $|t|$ small enough such that $\|A\|_t = \rho(A)$.

5. Let $A \in \mathbb{C}^{n\times n}$. Then there exists a norm $\|\cdot\|$ on $\mathbb{C}^n$ such that $\rho(A) = \|A\|$ if and only if each eigenvalue $\lambda$ of modulus $\rho(A)$ is geometrically simple. Hint: Note that if $\rho(A) = 1$ and there is an eigenvalue $\lambda$, $|\lambda| = 1$, which is not geometrically simple, then $A^m$, $m \in \mathbb{N}$ is not a bounded sequence.

6. Assume that $\|\cdot\|_a, \|\cdot\|_b$ are two absolute norms on $\mathbb{C}^n$ and $\mathbb{C}^m$. Assume that $Q_1 \in \mathbb{C}^{m\times m}$, $Q_2 \in \mathbb{C}^{n\times n}$ are two diagonal matrices such that the absolute value of each diagonal entry is 1. Show that for any $A \in \mathbb{C}^{m\times n}$, $\|Q_1AQ_2\|_{a,b} = \|A\|_{a,b}$.

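7. Show

(a) If $A \in \mathbb{R}_+^{m\times n}$ or $-A \in \mathbb{R}_+^{m\times n}$ then $\|A\|_{\infty,1} = \sum_{i=j=1}^{m,n}|a_{ij}|$.

(b) Let $x = (x_1, \ldots, x_n)^\top, y = (y_1, \ldots, y_n)^\top \in \mathbb{R}^n$. We say that $y$ has a weak sign pattern as $x$ if $y_i = 0$ for $x_i = 0$ and $y_ix_i \ge 0$ for $x_i \ne 0$. Let $A \in \mathbb{R}^{m\times n}$. Assume that there exists $x \in \mathbb{R}^n$ such that for each row $r_i$ of $A$ either $r_i$ or $-r_i$ has a weak sign pattern as $x$. Then $\|A\|_{\infty,1} = \sum_{i=j=1}^{m,n}|a_{ij}|$.

(c) Let $A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$. Assume that $a_{11}, a_{12}, a_{21}, -a_{22} > 0$. Show that $\|A\|_{\infty,1} < a_{11} + a_{12} + a_{21} - a_{22}$.

(d) Generalize the results of Problem 7c to $A \in \mathbb{C}^{m\times n}$.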



7.8 Tensor products of convex sets 

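Definition 7.8.1 Let $\mathbf{V}_i$ be a finite dimensional vector space over $\mathbb{F} = \mathbb{R}, \mathbb{C}$ for $i = 1, \ldots, m$. Let $\mathbf{V} = \otimes_{i=1}^{m}\mathbf{V}_i$. Assume that $X_i \subset \mathbf{V}_i$ for $i = 1, \ldots, m$. Denote

$$\odot_{i=1}^{m}X_i := \{\otimes_{i=1}^{m}x_i, \text{ for all } x_i \in X_i,\ i = 1, \ldots, m\}.$$

We call $\odot_{i=1}^{m}X_i$ the set tensor product of $X_1, \ldots, X_m$, or simply the tensor product of $X_1, \ldots, X_m$.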

Lemma 7.8.2 Let $C_i$ be a compact convex set in a finite dimensional vector space $\mathbf{V}_i$ for $i = 1, \ldots, m$. Then $\mathrm{conv}\,\odot_{i=1}^{m}C_i$ is a compact convex set in $\otimes_{i=1}^{m}\mathbf{V}_i$, whose extreme points are contained in $\odot_{i=1}^{m}\mathcal{E}(C_i)$. In particular, if $C_1$ and $C_2$ are polytopes then $\mathrm{conv}\,C_1 \odot C_2$ is a polytope.

Proof. Since $C_i$ is compact for $i = 1, \ldots, m$, it is straightforward to show that $C := \odot_{i=1}^{m}C_i$ is a compact set in $\mathbf{V} = \otimes_{i=1}^{m}\mathbf{V}_i$. Hence $\mathrm{conv}\,C$ is compact. Let $\dim\mathbf{V}_i = d_i$. Since $C_i$ is compact, Theorem 7.1.3 implies that each $x_i \in C_i$ is of the form $x_i = \sum_{j_i=1}^{d_i+1}a_{ij_i}y_{ij_i}$, where $a_{ij_i} \ge 0$, $y_{ij_i} \in \mathcal{E}(C_i)$ for $j_i = 1, \ldots, d_i+1$, and $\sum_{j_i=1}^{d_i+1}a_{ij_i} = 1$. Hence

$$\otimes_{i=1}^{m}x_i = \sum_{j_1=\ldots=j_m=1}^{d_1+1,\ldots,d_m+1}\Big(\prod_{i=1}^{m}a_{ij_i}\Big)\big(\otimes_{i=1}^{m}y_{ij_i}\big).$$

Note that $\prod_{i=1}^{m}a_{ij_i} \ge 0$ and $\sum_{j_1=\ldots=j_m=1}^{d_1+1,\ldots,d_m+1}\prod_{i=1}^{m}a_{ij_i} = 1$. Hence $C \subset \mathrm{conv}\,\odot_{i=1}^{m}\mathcal{E}(C_i)$, which implies $\mathrm{conv}\,C \subset \mathrm{conv}\,\odot_{i=1}^{m}\mathcal{E}(C_i) \subset \mathrm{conv}\,C$.

Assume that $C_1, C_2$ are polytopes. Then $\mathrm{conv}\,C_1 \odot C_2$ is nonempty, compact and convex, and its set of extreme points is finite. Hence $\mathrm{conv}\,C_1 \odot C_2$ is a polytope. □

See Problem 1 for an example where $\mathcal{E}(C_1 \odot C_2)$ is strictly contained in $\mathcal{E}(C_1) \odot \mathcal{E}(C_2)$. The next two examples give two important cases where $\mathcal{E}(C_1 \odot C_2) = \mathcal{E}(C_1) \odot \mathcal{E}(C_2)$. In these two examples $\mathbb{C}^{p\times q} \otimes \mathbb{C}^{m\times n}$ is viewed as a subspace of $\mathbb{C}^{pm\times qn}$, where the tensor product of two matrices $A \otimes B$ is the Kronecker tensor product.

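Proposition 7.8.3 Let $m, n \ge 2$ be integers. Then $\Omega_m \odot \Omega_n \subset \Omega_{mn}$, and $\mathcal{E}(\mathrm{conv}\,\Omega_m \odot \Omega_n) = \mathcal{P}_m \odot \mathcal{P}_n$.

Proof. Let $A = [a_{ij}] \in \Omega_m$, $B = [b_{pq}] \in \Omega_n$. Then $A \otimes B = [c_{(i,p)(j,q)}] \in \mathbb{R}_+^{mn\times mn}$, where $c_{(i,p)(j,q)} = a_{ij}b_{pq}$. Clearly

$$\sum_{j=1}^{m}c_{(i,p)(j,q)} = \sum_{j=1}^{m}a_{ij}b_{pq} = b_{pq}, \qquad \sum_{i=1}^{m}c_{(i,p)(j,q)} = \sum_{i=1}^{m}a_{ij}b_{pq} = b_{pq},$$
$$\sum_{q=1}^{n}c_{(i,p)(j,q)} = \sum_{q=1}^{n}a_{ij}b_{pq} = a_{ij}, \qquad \sum_{p=1}^{n}c_{(i,p)(j,q)} = \sum_{p=1}^{n}a_{ij}b_{pq} = a_{ij}.$$

Hence $A \otimes B \in \Omega_{mn}$, where we identify the set $\langle m\rangle \times \langle n\rangle$ with $\langle mn\rangle$. Since $\Omega_{mn}$ is convex it follows that $\mathrm{conv}\,\Omega_m \odot \Omega_n \subset \Omega_{mn}$. Recall that $\mathcal{E}(\Omega_{mn}) = \mathcal{P}_{mn}$. Clearly $\mathcal{P}_m \odot \mathcal{P}_n \subset \mathcal{P}_{mn}$. Problem 7.1.2 yields that $\mathcal{E}(\mathrm{conv}\,\Omega_m \odot \Omega_n) = \mathcal{P}_m \odot \mathcal{P}_n$. □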
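The inclusion $\Omega_m \odot \Omega_n \subset \Omega_{mn}$ can be confirmed on a concrete pair of doubly stochastic matrices. The sketch below is illustrative only; it uses exact rational arithmetic, stores row $(i,p)$ at index $i\cdot n + p$ (one possible identification of $\langle m\rangle \times \langle n\rangle$ with $\langle mn\rangle$), and all names are hypothetical.

```python
from fractions import Fraction as F

def kron(A, B):
    # Kronecker product: entry ((i,p),(j,q)) equals a_ij * b_pq.
    return [[A[i][j] * B[p][q]
             for j in range(len(A[0])) for q in range(len(B[0]))]
            for i in range(len(A)) for p in range(len(B))]

def is_doubly_stochastic(C):
    # Nonnegative square matrix whose row sums and column sums all equal 1.
    n = len(C)
    return (all(len(row) == n for row in C)
            and all(c >= 0 for row in C for c in row)
            and all(sum(row) == 1 for row in C)
            and all(sum(row[j] for row in C) == 1 for j in range(n)))

def partial_row_sum(C, i, p, q, m, n):
    # Sum over j of c_{(i,p)(j,q)}; the proof shows that it equals b_pq.
    return sum(C[i * n + p][j * n + q] for j in range(m))

A = [[F(1, 2), F(1, 2)], [F(1, 2), F(1, 2)]]      # an element of Omega_2
B = [[F(1, 3), F(2, 3), F(0)],
     [F(2, 3), F(0), F(1, 3)],
     [F(0), F(1, 3), F(2, 3)]]                    # an element of Omega_3
C = kron(A, B)                                    # a 6 x 6 matrix
```

Here `is_doubly_stochastic(C)` holds, and `partial_row_sum` reproduces the entries of `B`, matching the first displayed identity in the proof.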



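Proposition 7.8.4 Let $m, n \ge 2$ be integers. Then $\mathrm{H}_{m,+,1} \odot \mathrm{H}_{n,+,1} \subset \mathrm{H}_{mn,+,1}$, and $\mathcal{E}(\mathrm{conv}\,\mathrm{H}_{m,+,1} \odot \mathrm{H}_{n,+,1}) = \mathcal{E}(\mathrm{H}_{m,+,1}) \odot \mathcal{E}(\mathrm{H}_{n,+,1})$.

Proof. Let $A \in \mathrm{H}_m$, $B \in \mathrm{H}_n$ be nonnegative definite hermitian matrices. Then $A \otimes B$ is nonnegative definite. Since $\mathrm{tr}\,A \otimes B = (\mathrm{tr}\,A)(\mathrm{tr}\,B)$ it follows that $\mathrm{H}_{m,+,1} \odot \mathrm{H}_{n,+,1} \subset \mathrm{H}_{mn,+,1}$. Hence $\mathcal{E}(\mathrm{conv}\,\mathrm{H}_{m,+,1} \odot \mathrm{H}_{n,+,1}) \subset \mathcal{E}(\mathrm{H}_{m,+,1}) \odot \mathcal{E}(\mathrm{H}_{n,+,1})$.

Recall that the elements of $\mathcal{E}(\mathrm{H}_{m,+,1}), \mathcal{E}(\mathrm{H}_{n,+,1}), \mathcal{E}(\mathrm{H}_{mn,+,1})$ are the hermitian rank one matrices of trace 1 of the corresponding orders. Since $A \otimes B$ is a rank one matrix if $A$ and $B$ are rank one matrices, it follows that $\mathcal{E}(\mathrm{H}_{m,+,1}) \odot \mathcal{E}(\mathrm{H}_{n,+,1}) \subset \mathcal{E}(\mathrm{H}_{mn,+,1})$. Hence $\mathcal{E}(\mathrm{conv}\,\mathrm{H}_{m,+,1} \odot \mathrm{H}_{n,+,1}) = \mathcal{E}(\mathrm{H}_{m,+,1}) \odot \mathcal{E}(\mathrm{H}_{n,+,1})$. □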

Problem 7.8.5 Let $C_i$ be a compact convex set in a finite dimensional space $\mathbf{V}_i$ for $i = 1, 2$. Suppose that $C_i = \cap_{\alpha\in\mathcal{F}_i}H(f_\alpha, x_\alpha)$, where $\mathcal{F}_i$ is the set of all supporting hyperplanes of $C_i$ which characterize $C_i$, for $i = 1, 2$. ($\mathcal{F}_i$ may not be finite or countable.) The problem is to characterize the set of all supporting hyperplanes of $\mathrm{conv}\,C_1 \odot C_2$.

Equivalently, suppose we know how to decide if $x_i$ belongs or does not belong to $C_i$ for $i = 1, 2$. How do we determine if $x$ belongs or does not belong to $\mathrm{conv}\,C_1 \odot C_2$?

It seems that characterizing $\mathrm{conv}\,C_1 \odot C_2$ can be much more complex than characterizing $C_1$ and $C_2$. We will explain this remark in the two examples discussed in Propositions 7.8.3 and 7.8.4.

Consider first $\mathrm{H}_{m,+,1} \odot \mathrm{H}_{n,+,1}$. So any element in $\mathrm{H}_{m,+,1} \odot \mathrm{H}_{n,+,1}$ is of the form $A \otimes B$, where $A$ and $B$ are nonnegative definite hermitian matrices of trace one. The matrix $A \otimes B$ is called a pure state in quantum mechanics. A matrix $C \in \mathrm{conv}\,\mathrm{H}_{m,+,1} \odot \mathrm{H}_{n,+,1}$ is called a separable state. So $\mathrm{conv}\,\mathrm{H}_{m,+,1} \odot \mathrm{H}_{n,+,1}$ is the convex set of separable states. The set $\mathrm{H}_{mn,+,1} \setminus \mathrm{conv}\,\mathrm{H}_{m,+,1} \odot \mathrm{H}_{n,+,1}$ is the set of entangled states. See for example [BeZ06]. The following result is due to L. Gurvits [Gur03].

Theorem 7.8.6 For general positive integers $m, n$ and $A \in \mathrm{H}_{mn,+,1}$ the problem of deciding if $A$ is separable, i.e. $A \in \mathrm{conv}\,\mathrm{H}_{m,+,1} \odot \mathrm{H}_{n,+,1}$, is NP-Hard.

On the other hand, given a hermitian matrix $A \in \mathrm{H}_n$, it is well known that one can determine in polynomial time if $A$ belongs or does not belong to $\mathrm{H}_{n,+,1}$. See Problem 3. We will discuss the similar situation for $\mathrm{conv}\,\Omega_m \odot \Omega_n$ in §7.11.

Definition 7.8.7 Let $\mathbf{V}_i$ be a finite dimensional vector space over $\mathbb{F} = \mathbb{R}, \mathbb{C}$ with a norm $\|\cdot\|_i$ for $i = 1, \ldots, k$. Let $\mathbf{V} := \otimes_{i=1}^{k}\mathbf{V}_i$ with the norm $\|\cdot\|$. Then $\|\cdot\|$ is called a cross norm if

(7.8.1)   $\|\otimes_{i=1}^{k}x_i\| = \prod_{i=1}^{k}\|x_i\|_i$

for all rank one tensors.

Identify $\otimes_{i=1}^{k}\mathbf{V}_i^*$ with $\mathbf{V}^*$, where $(\otimes_{i=1}^{k}f_i)(\otimes_{i=1}^{k}x_i) = \prod_{i=1}^{k}f_i(x_i)$. Then $\|\cdot\|$ is called a normal cross norm if the norm $\|\cdot\|^*$ on $\mathbf{V}^*$ is a cross norm with respect to the norms $\|\cdot\|_i^*$ on $\mathbf{V}_i^*$ for $i = 1, \ldots, k$.

See [Sch50] for properties of the cross norms. We discuss the following known results needed in the sequel.

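Theorem 7.8.8 Let $\mathbf{V}_i$ be a finite dimensional vector space over $\mathbb{F} = \mathbb{R}, \mathbb{C}$ with a norm $\|\cdot\|_i$ for $i = 1, \ldots, k$. Let $\mathbf{V} := \otimes_{i=1}^{k}\mathbf{V}_i$. Then there exists a norm $\|\cdot\|$ on $\mathbf{V}$ satisfying (7.8.1). Furthermore, there exist unique norms $\|\cdot\|_{\max}, \|\cdot\|_{\min}$ satisfying the following properties. First, $\|\cdot\|_{\max}$ and $\|\cdot\|_{\min}$ are normal cross norms. Moreover $\|z\|_{\min} \le \|z\|_{\max}$ for all $z \in \mathbf{V}$. Second, any cross norm $\|\cdot\|$ on $\mathbf{V}$ satisfies the inequality $\|z\| \le \|z\|_{\max}$ for all $z \in \mathbf{V}$. Third, assume that $\|\cdot\|_a$ on $\mathbf{V}$ satisfies the equality

(7.8.2)   $\|\otimes_{i=1}^{k}f_i\|_a^* = \prod_{i=1}^{k}\|f_i\|_i^*$ for all $f_i \in \mathbf{V}_i^*$, $i = 1, \ldots, k$.

I.e. $\|\cdot\|_a^*$ is a cross norm with respect to the norms $\|\cdot\|_i^*$ on $\mathbf{V}_i^*$ for $i = 1, \ldots, k$. Then $\|z\|_{\min} \le \|z\|_a$ for all $z \in \mathbf{V}$. More precisely,

(7.8.3)   $B_{\|\cdot\|_{\max}} = \mathrm{conv}\,B_{\|\cdot\|_1} \odot \ldots \odot B_{\|\cdot\|_k}, \qquad B_{\|\cdot\|_{\min}^*} = \mathrm{conv}\,B_{\|\cdot\|_1^*} \odot \ldots \odot B_{\|\cdot\|_k^*}.$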

Proof. For simplicity of the exposition we let $k = 2$. Define the set $B := \mathrm{conv}\,B_{\|\cdot\|_1} \odot B_{\|\cdot\|_2}$. Clearly, $B$ is a compact convex balanced set and $0$ is in its interior. Hence there exists a norm $\|\cdot\|_{\max}$ such that $B = B_{\|\cdot\|_{\max}}$. We claim that $\|x \otimes y\|_{\max} = \|x\|_1\|y\|_2$. Clearly, to show that it is enough to assume that $\|x\|_1 = \|y\|_2 = 1$. Since $x \otimes y \in B_{\|\cdot\|_{\max}}$ we deduce that $\|x \otimes y\|_{\max} \le 1$. Problem 7.4.5c yields that there exist $f \in S_{\|\cdot\|_1^*}$, $g \in S_{\|\cdot\|_2^*}$ such that

$$1 = f(x) \ge |f(x_1)|,\ \forall x_1 \in B_{\|\cdot\|_1}, \qquad 1 = g(y) \ge |g(y_1)|,\ \forall y_1 \in B_{\|\cdot\|_2}.$$

Hence

(7.8.4)   $1 = (f \otimes g)(x \otimes y) \ge |(f \otimes g)(z)|$ for all $z \in B_{\|\cdot\|_{\max}}$.

Hence $x \otimes y \in \partial B_{\|\cdot\|_{\max}} = S_{\|\cdot\|_{\max}}$, i.e. $\|x \otimes y\|_{\max} = 1$. Therefore the norm $\|\cdot\|_{\max}$ satisfies (7.8.1). Let $\|\cdot\|$ be another norm on $\mathbf{V}$ satisfying (7.8.1). Hence $B_{\|\cdot\|_1} \odot B_{\|\cdot\|_2} \subset B_{\|\cdot\|}$, which yields $B_{\|\cdot\|_{\max}} = \mathrm{conv}\,B_{\|\cdot\|_1} \odot B_{\|\cdot\|_2} \subset B_{\|\cdot\|}$. Therefore, $\|z\| \le \|z\|_{\max}$.

We next observe that $\|\cdot\|_c := \|\cdot\|_{\max}^*$ on $\mathbf{V}^*$ satisfies the equality

(7.8.5)   $\|f \otimes g\|_c = \|f\|_1^*\,\|g\|_2^*$ for all $f \in \mathbf{V}_1^*$, $g \in \mathbf{V}_2^*$.

Recall that $\|f \otimes g\|_c = \max_{z\in B_{\|\cdot\|_{\max}}}|(f \otimes g)(z)|$. Use (7.8.4) to deduce (7.8.5). Hence $\|\cdot\|_{\max}$ is a normal cross norm.

Let $\|\cdot\|_b$ be the norm given by the unit ball $\mathrm{conv}\,B_{\|\cdot\|_1^*} \odot B_{\|\cdot\|_2^*}$ on $\mathbf{V}^*$. Hence the above results show that $\|\cdot\|_b$ is a normal cross norm. Recall that for any norm $\|\cdot\|_a^*$ on $\mathbf{V}^*$ satisfying (7.8.5) we showed the inequality $\|h\|_a^* \le \|h\|_b$ for any $h \in \mathbf{V}^*$. Define $\|\cdot\|_{\min} := \|\cdot\|_b^*$. Hence $\|z\|_{\min} \le \|z\|_a$. The previous arguments show that $\|\cdot\|_{\min}$ satisfies the equality (7.8.1). Hence $\|z\|_{\min} \le \|z\|_{\max}$ for all $z \in \mathbf{V}$. Also $\|\cdot\|_{\min}$ is a normal cross norm.

□



Proposition 7.8.9 Let $k > 1$ be an integer. Assume that $\mathbf{V}_1, \ldots, \mathbf{V}_k$ are inner product spaces over $\mathbb{F} = \mathbb{R}, \mathbb{C}$. Let $\mathbf{V} = \otimes_{i=1}^{k}\mathbf{V}_i$ with the inner product induced by the inner products on $\mathbf{V}_1, \ldots, \mathbf{V}_k$. Assume that $\|\cdot\|_1, \ldots, \|\cdot\|_k, \|\cdot\|$ are the norms induced by the corresponding inner products on $\mathbf{V}_1, \ldots, \mathbf{V}_k, \mathbf{V}$. Then $\|\cdot\|$ is a normal cross norm. If $\dim\mathbf{V}_i > 1$ for $i = 1, \ldots, k$ then $\|\cdot\|$ is different from $\|\cdot\|_{\max}$ and $\|\cdot\|_{\min}$.

See Problem 7.

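Theorem 7.8.10 Let $\mathbf{U}, \mathbf{V}$ be finite dimensional vector spaces over $\mathbb{F} = \mathbb{R}, \mathbb{C}$ with norms $\|\cdot\|, |||\cdot|||$ respectively. Identify $\mathbf{W} = \mathbf{V} \otimes \mathbf{U}^*$ with $\mathrm{Hom}(\mathbf{U}, \mathbf{V})$, via the isomorphism $\theta : \mathbf{W} \to \mathrm{Hom}(\mathbf{U}, \mathbf{V})$, where $\theta(v \otimes f)(u) = f(u)v$ for any $f \in \mathbf{U}^*$. Then the minimal cross norm $\|\cdot\|_{\min}$ on $\mathbf{W}$ is the operator norm on $\mathrm{Hom}(\mathbf{U}, \mathbf{V})$, where the norms on $\mathbf{U}^*$ and $\mathbf{V}$ are $\|\cdot\|^*$ and $|||\cdot|||$ respectively. Identify $\mathbf{W}^*$ with $\mathbf{V}^* \otimes \mathbf{U} \sim \mathrm{Hom}(\mathbf{U}^*, \mathbf{V}^*)$. Then the maximal cross norm $\|\cdot\|_{\max}$ on $\mathbf{W}$ is the conjugate to the operator norm on $\mathrm{Hom}(\mathbf{U}^*, \mathbf{V}^*)$, which is identified with $\mathbf{W}^*$.

Proof. Let $T \in \mathrm{Hom}(\mathbf{U}, \mathbf{V})$. Then

$$\|T\| = \max_{u\in B_{\|\cdot\|}}|||T(u)||| = \max_{g\in S_{|||\cdot|||^*},\ u\in B_{\|\cdot\|}}|g(T(u))|.$$

Let $\theta^{-1} : \mathrm{Hom}(\mathbf{U}, \mathbf{V}) \to \mathbf{V} \otimes \mathbf{U}^*$ be the inverse of the isomorphism given in the theorem. Then $g(T(u)) = (g \otimes u)(\theta^{-1}(T))$. Let $\|\cdot\|_b$ be the norm given by the unit ball $\mathrm{conv}\,B_{|||\cdot|||^*} \odot B_{\|\cdot\|}$ on $\mathbf{V}^* \otimes \mathbf{U} \sim \mathbf{W}^*$, as in the proof of Theorem 7.8.8. Then

$$\|\theta^{-1}(T)\|_b^* = \max_{g\in S_{|||\cdot|||^*},\ u\in S_{\|\cdot\|}}|(g \otimes u)(\theta^{-1}(T))|.$$

Use the proof of Theorem 7.8.8 to deduce that $\|T\| = \|\theta^{-1}(T)\|_{\min}$.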

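Similar arguments show that the conjugate norm to the operator norm of $\mathrm{Hom}(\mathbf{U}^*, \mathbf{V}^*)$, identified with $\mathbf{V}^* \otimes \mathbf{U} \sim \mathbf{W}^*$, gives the norm $\|\cdot\|_{\max}$ on $\mathbf{W}$. □

Use the above theorem and Problem 7.4.14 to deduce:

Corollary 7.8.11 Let $\mathbf{U}, \mathbf{V}$ be finite dimensional inner product spaces over $\mathbb{F} = \mathbb{R}, \mathbb{C}$, with the corresponding induced norms. Identify $\mathbf{W} = \mathbf{V} \otimes \mathbf{U}^*$ with $\mathrm{Hom}(\mathbf{U}, \mathbf{V})$ as in Theorem 7.8.10. Then $\|T\|_{\min} = \sigma_1(T)$ and $\|T\|_{\max} = \sum_{i=1}^{\min(\dim\mathbf{U},\,\dim\mathbf{V})}\sigma_i(T)$.

More generally, given finite dimensional vector spaces $\mathbf{U}_i, \mathbf{V}_i$ over $\mathbb{F} = \mathbb{R}, \mathbb{C}$ for $i = 1, \ldots, k$ we identify the tensor space $\otimes_{i=1}^{k}\mathrm{Hom}(\mathbf{U}_i, \mathbf{V}_i)$ with $\mathrm{Hom}(\otimes_{i=1}^{k}\mathbf{U}_i, \otimes_{i=1}^{k}\mathbf{V}_i)$ using the isomorphism

$\iota : \otimes_{i=1}^{k}\mathrm{Hom}(\mathbf{U}_i, \mathbf{V}_i) \to \mathrm{Hom}(\otimes_{i=1}^{k}\mathbf{U}_i, \otimes_{i=1}^{k}\mathbf{V}_i)$ satisfying

(7.8.6)   $\iota(\otimes_{i=1}^{k}T_i)(\otimes_{i=1}^{k}u_i) = \otimes_{i=1}^{k}(T_iu_i)$

where $T_i \in \mathrm{Hom}(\mathbf{U}_i, \mathbf{V}_i)$, $u_i \in \mathbf{U}_i$, $i = 1, \ldots, k$.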

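Theorem 7.8.12 Let $\mathbf{U}_i, \mathbf{V}_i$ be finite dimensional vector spaces over $\mathbb{F} = \mathbb{R}, \mathbb{C}$ with the norms $\|\cdot\|_i, |||\cdot|||_i$ respectively for $i = 1, \ldots, k$. Let $N_i(\cdot)$ be the operator norm on $\mathrm{Hom}(\mathbf{U}_i, \mathbf{V}_i)$ for $i = 1, \ldots, k$. Let $\|\cdot\|_{\max}$ be the maximal cross norm on $\mathbf{U} := \otimes_{i=1}^{k}\mathbf{U}_i$ and $|||\cdot|||$ be any cross norm on $\mathbf{V} := \otimes_{i=1}^{k}\mathbf{V}_i$. Then the operator norm $N(\cdot)$ on $\mathrm{Hom}(\mathbf{U}, \mathbf{V})$, identified with $\otimes_{i=1}^{k}\mathrm{Hom}(\mathbf{U}_i, \mathbf{V}_i)$, is a cross norm with respect to the norms $N_i(\cdot)$, $i = 1, \ldots, k$.

Proof. Since $B_{\|\cdot\|_{\max}} = \mathrm{conv}\,\odot_{i=1}^{k}B_{\|\cdot\|_i}$ we deduce that for any $T \in \mathrm{Hom}(\mathbf{U}, \mathbf{V})$ one has

$$N(T) = \max_{u_i\in B_{\|\cdot\|_i},\ i\in\langle k\rangle}|||T(\otimes_{i=1}^{k}u_i)|||.$$

Let $T = \otimes_{i=1}^{k}T_i$. Since $|||\cdot|||$ is a cross norm on $\mathbf{V}$ we deduce

$$N(T) = \max_{u_i\in B_{\|\cdot\|_i},\ i\in\langle k\rangle}|||\otimes_{i=1}^{k}T_i(u_i)||| = \max_{u_i\in B_{\|\cdot\|_i},\ i\in\langle k\rangle}\prod_{i=1}^{k}|||T_i(u_i)|||_i = \prod_{i=1}^{k}N_i(T_i).$$

□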



Problems 

1. Let $\mathbf{V}_1, \mathbf{V}_2$ be one dimensional subspaces with bases $e_1, e_2$ respectively. Let $C_1 = [-e_1, 2e_1]$, $C_2 = [-e_2, 3e_2]$. Show that $C_1 \odot C_2 = [-3(e_1 \otimes e_2), 6(e_1 \otimes e_2)]$. Hence $\mathcal{E}(C_1 \odot C_2)$ is contained strictly in $\mathcal{E}(C_1) \odot \mathcal{E}(C_2)$. Lemma 7.8.2 yields that $\mathrm{conv}\,\Omega_m \odot \Omega_n = \mathrm{conv}\,\mathcal{P}_m \odot \mathcal{P}_n$.

2. Index the entries of $C \in \Omega_{mn}$ as $c_{(i,p),(j,q)}$. Show

(a) Assume that $C \in \mathrm{conv}\,\Omega_m \odot \Omega_n$. Then the entries of $C$ satisfy

$$\sum_{i=1}^{m}c_{(i,p)(j,q)} = \sum_{j=1}^{m}c_{(i,p)(j,q)} \text{ for each } i, j \in \langle m\rangle,\ p, q \in \langle n\rangle,$$
$$\sum_{p=1}^{n}c_{(i,p)(j,q)} = \sum_{q=1}^{n}c_{(i,p)(j,q)} \text{ for each } i, j \in \langle m\rangle,\ p, q \in \langle n\rangle.$$

(b) For $m = n = 2$ the standard conditions for $4 \times 4$ doubly stochastic matrices, and the conditions in part 2a, characterize the set $\mathrm{conv}\,\Omega_2 \odot \Omega_2$.

(c) For $m = n \ge 4$ the standard conditions for $n^2 \times n^2$ doubly stochastic matrices, and the conditions in part 2a, give a set which contains strictly $\mathrm{conv}\,\Omega_n \odot \Omega_n$. Hint: Consult with [Fri08].

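3. Show

(a) $A \in \mathrm{H}_{n,+,1}$ if and only if $\mathrm{tr}\,A = 1$ and $A \ge 0$.

(b) $A \in \mathrm{H}_{n,+} \Rightarrow \det A \ge 0$.

(c) Assume that $A = [a_{ij}]_{i=j=1}^{n} \in \mathrm{H}_n$ and $\det A > 0$. Then $A \ge 0$ if and only if $\det[a_{ij}]_{i=j=1}^{p} > 0$ for $p = 1, \ldots, n-1$.

(d) Assume that $A = [a_{ij}]_{i=j=1}^{n} \in \mathrm{H}_n$ and $\det A = 0$. Find $0 \ne x \in \mathbb{C}^n$ such that $Ax = 0$. Let $x_n = \frac{1}{\|x\|_2}x$ and complete $x_n$ to an orthonormal basis $x_1, \ldots, x_n$. Let $A_{n-1} = [x_i^*Ax_j]_{i=j=1}^{n-1}$. Then $A_{n-1} \in \mathrm{H}_{n-1}$. Furthermore, $A \ge 0$ if and only if $A_{n-1} \ge 0$.

4. Let $\tau : \mathbb{C}^{n\times n} \to \mathbb{C}^{n\times n}$ be the transpose map: $\tau(A) = A^\top$. Show

(a) $\tau(A)$ is similar to $A$ for any $A \in \mathbb{C}^{n\times n}$.

(b) $\tau$ leaves invariant the following subsets of $\mathbb{C}^{n\times n}$: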

R" XI \ S n (R), S n (C), 0(n, K), 0(n, C), 
U(n, K), N(n, R), N(n, C), H„, H n ,+ , H n , +il . 

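5. On $\mathbb{C}^{mn\times mn}$, viewed as $\mathbb{C}^{m\times m} \otimes \mathbb{C}^{n\times n}$, we define the partial transpose $\tau_{par}$ as follows. Let $C = [c_{(i,p),(j,q)}]_{i=j=p=q=1}^{m,m,n,n} \in \mathbb{C}^{m\times m} \otimes \mathbb{C}^{n\times n}$. Then $\tau_{par}(C) = [\tilde{c}_{(i,p),(j,q)}]_{i=j=p=q=1}^{m,m,n,n}$, where $\tilde{c}_{(i,p),(j,q)} = c_{(i,q),(j,p)}$ for $i, j \in \langle m\rangle$, $p, q \in \langle n\rangle$. Equivalently, $\tau_{par}$ is uniquely determined by the condition $\tau_{par}(A \otimes B) = A \otimes B^\top$ for any $A \in \mathbb{C}^{m\times m}$, $B \in \mathbb{C}^{n\times n}$. Show

(a) $\tau_{par}$ leaves the following subsets of $\mathbb{C}^{mn\times mn}$ invariant: $\mathrm{S}_{mn}(\mathbb{R})$, $\mathrm{H}_{mn}$, and all the sets of the form $X \odot Y$, where $X \subset \mathbb{C}^{m\times m}$ and $Y \subset \mathbb{C}^{n\times n}$ are given in Problem 4b. In particular the convex set of separable states $\mathrm{conv}\,\mathrm{H}_{m,+,1} \odot \mathrm{H}_{n,+,1}$ is invariant under the partial transpose.

(b) Show that for $m = 2$ and $n = 2, 3$, $C \in \mathrm{H}_{mn}$ is a separable state, i.e. $C \in \mathrm{conv}\,\mathrm{H}_{m,+,1} \odot \mathrm{H}_{n,+,1}$, if and only if $C, \tau_{par}(C) \in \mathrm{H}_{mn,+,1}$. (This is the Horodecki-Peres condition [Hor96, Per96].)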

6. Let the assumptions of Theorem 7.8.8 hold. Show

(a) Each $z \in \mathbf{V}$ can be decomposed, usually in many ways, as a sum of rank one tensors

$$z = \sum_{j=1}^{N}\otimes_{i=1}^{k}x_{j,i}, \quad x_{j,i} \in \mathbf{V}_i,\ i = 1, \ldots, k,\ j = 1, \ldots, N,$$

where $N = \prod_{i=1}^{k}\dim\mathbf{V}_i$. Then $\|z\|_{\max}$ is the minimum of $\sum_{j=1}^{N}\prod_{i=1}^{k}\|x_{j,i}\|_i$ over all the above decompositions of $z$.

(b) $\|z\|_{\min} = \max_{f_i\in B_{\|\cdot\|_i^*},\ i=1,\ldots,k}|(\otimes_{i=1}^{k}f_i)(z)|.$

7. Prove Proposition 7.8.9. Hint: To prove the first part of the problem choose orthonormal bases in $\mathbf{V}_1, \ldots, \mathbf{V}_k$. To prove the second part observe that $\|\cdot\|$ is smooth, while $\|\cdot\|_{\min}, \|\cdot\|_{\max}$ are not smooth if $\dim\mathbf{V}_i > 1$ for $i = 1, \ldots, k > 1$.
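The index rule of Problem 5 for the partial transpose can be made concrete. The sketch below is an illustration with hypothetical names; it assumes that the pair $(i,p)$ is stored at the zero-based index $i\cdot n + p$, which is the Kronecker ordering used for $A \otimes B$.

```python
def kron(A, B):
    # Kronecker product: entry ((i,p),(j,q)) equals a_ij * b_pq.
    return [[A[i][j] * B[p][q]
             for j in range(len(A[0])) for q in range(len(B[0]))]
            for i in range(len(A)) for p in range(len(B))]

def transpose(B):
    return [list(col) for col in zip(*B)]

def partial_transpose(C, m, n):
    # New entry ((i,p),(j,q)) is the old entry ((i,q),(j,p)):
    # the two indices of the second tensor factor are swapped.
    return [[C[i * n + q][j * n + p]
             for j in range(m) for q in range(n)]
            for i in range(m) for p in range(n)]

A = [[1, 2], [3, 4]]
B = [[1, 0, 2], [5, 7, 0], [0, 3, 1]]
lhs = partial_transpose(kron(A, B), 2, 3)
rhs = kron(A, transpose(B))   # tau_par(A (x) B) = A (x) B^T
```

Here `lhs == rhs`, and applying `partial_transpose` twice returns the original matrix.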

7.9 Variation of tensor powers and spectra 

Definition 7.9.1 Let $\mathbf{V}_1, \mathbf{V}_2$ be finite dimensional vector spaces over $\mathbb{F} = \mathbb{R}, \mathbb{C}$ with norms $\|\cdot\|_1, \|\cdot\|_2$ respectively. Let $\mu : \mathbf{V}_1 \to \mathbf{V}_2$ be a nonlinear map. The map $\mu$ has a Fréchet derivative at $x \in \mathbf{V}_1$, or simply is differentiable at $x$, if there exists a linear transformation $T_x \in \mathrm{Hom}(\mathbf{V}_1, \mathbf{V}_2)$ such that

$$\mu(x + u) = \mu(x) + T_xu + o(u)\|u\|_1,$$

where $\|o(u)\|_2 \to 0$ uniformly as $\|u\|_1 \to 0$. Denote $D\mu(x) := T_x$. $\mu$ is differentiable if it has the Fréchet derivative at each $x \in \mathbf{V}_1$, and $D\mu(x)$ is continuous on $\mathbf{V}_1$. (Note that by choosing fixed bases in $\mathbf{V}_1, \mathbf{V}_2$ each $D\mu(x)$ is represented by a matrix $A(x) = [a_{ij}(x)] \in \mathbb{F}^{m\times n}$, where $n = \dim\mathbf{V}_1$, $m = \dim\mathbf{V}_2$. Then $a_{ij}(x)$ is continuous on $\mathbf{V}_1$ for each $i \in \langle m\rangle$, $j \in \langle n\rangle$.)

Since all norms on a finite dimensional vector space are equivalent, it is straightforward to show that the notion of Fréchet derivative depends only on the standard topologies in $\mathbf{V}_1, \mathbf{V}_2$. See Problem 1. For properties of the Fréchet derivative consult with [Die69].

Proposition 7.9.2 Let $\mathbf{V}_1, \mathbf{V}_2$ be finite dimensional vector spaces over $\mathbb{F} = \mathbb{R}, \mathbb{C}$ with norms $\|\cdot\|_1, \|\cdot\|_2$ respectively. Assume that $\mu : \mathbf{V}_1 \to \mathbf{V}_2$ is differentiable. Then for any $x, y \in \mathbf{V}_1$ the following equality and inequalities hold.

(7.9.1)   $\mu(y) - \mu(x) = \int_0^1 D\mu((1-t)x + ty)(y - x)\,dt,$

(7.9.2)   $\|\mu(y) - \mu(x)\|_2 \le \|y - x\|_1\int_0^1\|D\mu((1-t)x + ty)\|_{1,2}\,dt \le \|y - x\|_1\max_{t\in[0,1]}\|D\mu((1-t)x + ty)\|_{1,2}.$

(7.9.1) and (7.9.2) are called here the mean value theorem and the mean value inequalities respectively.

Proof. Let $x, u \in \mathbf{V}_1$ be fixed. Clearly, the function $\mu(x + tu)$ is a differentiable function from $\mathbb{R}$ to $\mathbf{V}_2$, where

(7.9.3)   $\dfrac{d\mu(x + tu)}{dt} = D\mu(x + tu)u.$

Letting $u = y - x$ and integrating the above equality for $t \in [0, 1]$ we get (7.9.1). Replacing the integration in (7.9.1) by the limiting summation and using the triangle inequality we obtain

$$\|\mu(y) - \mu(x)\|_2 \le \int_0^1\|D\mu((1-t)x + ty)(y - x)\|_2\,dt \le \int_0^1\|D\mu((1-t)x + ty)\|_{1,2}\,\|y - x\|_1\,dt \le \|y - x\|_1\max_{t\in[0,1]}\|D\mu((1-t)x + ty)\|_{1,2}.$$

□



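Theorem 7.9.3 Let $\mathbf{V}$ be a finite dimensional vector space. Let $k \in \mathbb{N}$. Denote $\mathbf{V}^{\otimes k} := \underbrace{\mathbf{V} \otimes \ldots \otimes \mathbf{V}}_{k}$. Consider the map $\delta_k : \mathbf{V} \to \mathbf{V}^{\otimes k}$, where $\delta_k(x) = \underbrace{x \otimes \ldots \otimes x}_{k}$. Then

(7.9.4)   $D\delta_k(x)(u) = u \otimes x \otimes \ldots \otimes x + x \otimes u \otimes x \otimes \ldots \otimes x + \ldots + x \otimes \ldots \otimes x \otimes u.$

Let $\|\cdot\|$ be a norm on $\mathbf{V}$ and assume that $\|\cdot\|_k$ is a cross norm on $\mathbf{V}^{\otimes k}$:

(7.9.5)   $\|x_1 \otimes x_2 \otimes \ldots \otimes x_k\|_k = \prod_{i=1}^{k}\|x_i\|$ for $x_1, \ldots, x_k \in \mathbf{V}$.

Denote by $N_k(T) := \|T\|_{\|\cdot\|,\|\cdot\|_k}$ the operator norm of $T \in \mathrm{Hom}(\mathbf{V}, \mathbf{V}^{\otimes k})$. Then

(7.9.6)   $N_k(D\delta_k(x)) = k\|x\|^{k-1}.$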

Proof. Fix $x, u \in \mathbf{V}$. For $t \in \mathbb{R}$ expand the vector $\delta_k(x + tu)$ in powers of $t$. Then

$$\delta_k(x + tu) = \delta_k(x) + t\big(u \otimes \underbrace{x \otimes \ldots \otimes x}_{k-1} + x \otimes u \otimes \underbrace{x \otimes \ldots \otimes x}_{k-2} + \ldots + \underbrace{x \otimes \ldots \otimes x}_{k-1} \otimes u\big) + \text{higher order terms in } t.$$

Hence (7.9.4) holds. Apply the triangle inequality to (7.9.4) and use the assumption that $\|\cdot\|_k$ is a cross norm to deduce the inequality $\|D\delta_k(x)(u)\|_k \le k\|x\|^{k-1}\|u\|$. Hence $N_k(D\delta_k(x)) \le k\|x\|^{k-1}$. Clearly, equality holds if $x = 0$. Suppose that $x \ne 0$. Then $\|D\delta_k(x)(x)\|_k = k\|x\|^k$. Hence $N_k(D\delta_k(x)) \ge k\|x\|^{k-1}$, which establishes (7.9.6). □



Theorem 7.9.4 Let $U$ be a finite dimensional vector space over $\mathbb{F} = \mathbb{R}, \mathbb{C}$.
For an integer $k \ge 1$ consider the map $\delta_k : \mathrm{Hom}(U,U) \to \mathrm{Hom}(U,U)^{\otimes k} \cong \mathrm{Hom}(U^{\otimes k}, U^{\otimes k})$
given by $\delta_k(T) = \underbrace{T \otimes \ldots \otimes T}_{k}$. Let $W_k \subseteq U^{\otimes k}$ be a
subspace which is invariant for each $\delta_k(T)$, $T \in \mathrm{Hom}(U,U)$. Denote by
$\hat\delta_k : \mathrm{Hom}(U,U) \to \mathrm{Hom}(W_k, W_k)$ the restriction map $\delta_k(T)|W_k$.
Assume that $\|\cdot\|$ is a norm on $U$. Let $\|\cdot\|_k$ be the maximal cross norm on
$U^{\otimes k}$. Let $\|\cdot\|, \|\cdot\|_k, |||\cdot|||_k$ be the induced operator norms on $\mathrm{Hom}(U,U)$,
$\mathrm{Hom}(U,U)^{\otimes k}$, $\mathrm{Hom}(W_k, W_k)$ respectively. Let $N_k(\cdot), \hat N_k(\cdot)$ be the operator
norms on $\mathrm{Hom}(\mathrm{Hom}(U,U), \mathrm{Hom}(U,U)^{\otimes k})$,
$\mathrm{Hom}(\mathrm{Hom}(U,U), \mathrm{Hom}(W_k, W_k))$ respectively. Then

(7.9.7) $N_k(D\delta_k(T)) = k\|T\|^{k-1}, \quad \hat N_k(D\hat\delta_k(T)) \le k\|T\|^{k-1}$

for any $T \in \mathrm{Hom}(U,U)$.

Proof. Theorem 7.8.12 yields that the operator norm $\|\cdot\|_k$ on $\mathrm{Hom}(U^{\otimes k}, U^{\otimes k})$,
identified with $\mathrm{Hom}(U,U)^{\otimes k}$, is a cross norm with respect to the operator
norm on $\mathrm{Hom}(U,U)$. Theorem 7.9.3 yields the equality in (7.9.7). Observe
next that $D\hat\delta_k(T)$ is $D\delta_k(T)|W_k$. Hence $\hat N_k(D\hat\delta_k(T)) \le N_k(D\delta_k(T))$,
which implies the inequality in (7.9.7). □

A simple example of $W_k$ is the subspace $\bigwedge^k U$. See Problem 3.

Theorem 7.9.5 Let $U$ be an $n$-dimensional vector space over $\mathbb{F} = \mathbb{R}, \mathbb{C}$,
with a norm $\|\cdot\|$. Denote by $\|\cdot\|$ the induced operator norm on $\mathrm{Hom}(U,U)$.
Then for $A, B \in \mathrm{Hom}(U,U)$

(7.9.8) $|\det A - \det B| \le \|A - B\| \frac{\|A\|^n - \|B\|^n}{\|A\| - \|B\|} \le n\|A - B\|[\max(\|A\|, \|B\|)]^{n-1}.$

Here $\frac{a^n - a^n}{a - a} := na^{n-1}$ for any $a \in \mathbb{C}$. The first inequality is sharp for
$A = aI_n$, $B = bI_n$ for $a, b \ge 0$. The constant $n$ in the second inequality is
sharp.

Proof. In $U^{\otimes n}$ consider the one dimensional subspace $W_n := \bigwedge^n U$, which is
invariant for each $\delta_n(T)$, $T \in \mathrm{Hom}(U,U)$. See Problem 3. Let $e_1, \ldots, e_n$ be a
basis in $U$. Then $e_1 \wedge e_2 \wedge \ldots \wedge e_n$ is a basis vector in $\bigwedge^n U$. Furthermore

$\delta_n(T)(e_1 \wedge e_2 \wedge \ldots \wedge e_n) = (\det T)\, e_1 \wedge e_2 \wedge \ldots \wedge e_n.$

See Proposition 5.2.7. Note that $\hat\delta_n(T) := \delta_n(T)|\bigwedge^n U$ is the above operator.
Observe next that any $Q \in \mathrm{Hom}(\bigwedge^n U, \bigwedge^n U)$ is of the form

$Q(e_1 \wedge e_2 \wedge \ldots \wedge e_n) = t\, e_1 \wedge e_2 \wedge \ldots \wedge e_n.$

Hence the operator norm of $Q$ is $|t|$. We now apply Theorem 7.9.4 to this
case. The inequality in (7.9.7) yields

$\hat N_n(D\hat\delta_n(T)) \le n\|T\|^{n-1}.$

Next we apply Proposition 7.9.2, where $V_1 := \mathrm{Hom}(U,U)$ and $V_2 =
\mathrm{Hom}(\bigwedge^n U, \bigwedge^n U)$ are equipped with the operator norms, and $\mu(T) = \hat\delta_n(T)$.
So $\|\mu(A) - \mu(B)\|_2 = |\det A - \det B|$. The inequality (7.9.2) combined with
the inequality in (7.9.7) yields

$|\det A - \det B| \le n\|A - B\| \int_0^1 \|(1-t)A + tB\|^{n-1}\,dt \le n\|A - B\| \int_0^1 ((1-t)\|A\| + t\|B\|)^{n-1}\,dt$

$= \|A - B\| \frac{\|A\|^n - \|B\|^n}{\|A\| - \|B\|} = \|A - B\| \sum_{i=0}^{n-1} \|A\|^{n-1-i}\|B\|^i \le n\|A - B\|[\max(\|A\|, \|B\|)]^{n-1}.$






This shows (7.9.8). Recall that $\|xI\| = |x|$ for any $x \in \mathbb{F}$. Hence, for
$A = aI$, $B = bI$ and $a, b \ge 0$ equality holds in the first inequality of (7.9.8).
To show that the constant $n$ cannot be improved let $A = (1 + x)I$, $B = I$,
where $x > 0$. Then (7.9.8) is equivalent to the inequality $(1 + x)^n - 1 \le
nx(1 + x)^{n-1}$. Since $\lim_{x \searrow 0} \frac{(1+x)^n - 1}{x(1+x)^{n-1}} = n$ the constant $n$ cannot be
improved. □
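
As an informal numerical illustration of (7.9.8), the sketch below evaluates both bounds for small matrices. It assumes the operator norm induced by the max norm on $\mathbb{F}^n$ (the largest absolute row sum), which is one admissible choice among many; the helper names `det`, `op_norm_inf` and `det_bounds` are made up for this example.

```python
def det(M):
    """Determinant by Laplace expansion along the first row (fine for small n)."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total


def op_norm_inf(M):
    """Operator norm induced by the max norm: the largest absolute row sum."""
    return max(sum(abs(x) for x in row) for row in M)


def det_bounds(A, B):
    """Return (|det A - det B|, middle term of (7.9.8), right term of (7.9.8))."""
    n = len(A)
    diff = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]
    d, na, nb = op_norm_inf(diff), op_norm_inf(A), op_norm_inf(B)
    # (na^n - nb^n)/(na - nb) expanded as a sum, which also covers na == nb.
    quotient = sum(na ** (n - 1 - i) * nb ** i for i in range(n))
    return abs(det(A) - det(B)), d * quotient, n * d * max(na, nb) ** (n - 1)


# A = 3I, B = I: the first inequality becomes an equality (26 = 2 * 13).
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
A3 = [[3, 0, 0], [0, 3, 0], [0, 0, 3]]
print(det_bounds(A3, I3))                              # (26, 26, 54)
print(det_bounds([[1, 2], [3, 4]], [[0, 1], [1, 0]]))  # (1, 48, 84)
```

The first call reproduces the sharpness case $A = aI$, $B = bI$; the second shows that the bounds can be far from tight for generic matrices.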



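Definition 7.9.6 Let $\Sigma_n$ be the group of permutations $\sigma : \langle n \rangle \to \langle n \rangle$.
Let $S = \{\lambda_1, \ldots, \lambda_n\}$, $T = \{\mu_1, \ldots, \mu_n\}$ be two multisets in $\mathbb{C}$ containing $n$
elements each. Let

$\mathrm{dist}(S,T) = \max_{j \in \langle n \rangle} \min_{i \in \langle n \rangle} |\lambda_j - \mu_i|,$

$\mathrm{hdist}(S,T) = \max(\mathrm{dist}(S,T), \mathrm{dist}(T,S)),$

$\mathrm{pdist}(S,T) = \min_{\sigma \in \Sigma_n} \max_{i \in \langle n \rangle} |\lambda_i - \mu_{\sigma(i)}|.$

Note: $\mathrm{dist}(S,T)$ is the distance from $S$ to $T$, viewed as sets; $\mathrm{hdist}(S,T)$ is
the Hausdorff distance between $S$ and $T$, viewed as sets; $\mathrm{pdist}(S,T)$ is called
the permutational distance between two multisets of cardinality $n$. Clearly

(7.9.9) $\mathrm{hdist}(S,T) = \mathrm{hdist}(T,S), \quad \mathrm{pdist}(S,T) = \mathrm{pdist}(T,S), \quad \mathrm{dist}(S,T) \le \mathrm{hdist}(S,T) \le \mathrm{pdist}(S,T).$

See Problem 4.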
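
The three distances can be computed directly from their definitions for small multisets. The following is a brute-force sketch (the permutational distance enumerates all $n!$ matchings, so it is only meant for tiny $n$); multisets are represented as Python lists, and the function names simply mirror the notation above.

```python
from itertools import permutations


def dist(S, T):
    """One-sided distance: the largest distance from a point of S to the nearest point of T."""
    return max(min(abs(s - t) for t in T) for s in S)


def hdist(S, T):
    """Hausdorff distance between S and T viewed as sets."""
    return max(dist(S, T), dist(T, S))


def pdist(S, T):
    """Permutational distance: best worst-case gap over all matchings of S with T."""
    n = len(S)
    if n != len(T):
        raise ValueError("multisets must have the same cardinality")
    return min(
        max(abs(S[i] - T[sigma[i]]) for i in range(n))
        for sigma in permutations(range(n))
    )


S = [0, 0, 2]
T = [1, 3, 3]
print(dist(S, T), hdist(S, T), pdist(S, T))  # 1 1 3
```

In this example the repeated points force one of the two zeros in `S` to be matched with a 3 in `T`, so the permutational distance is strictly larger than the Hausdorff distance, in line with (7.9.9).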

Theorem 7.9.7 Let $U$ be an $n$-dimensional vector space over $\mathbb{C}$ with the
norm $\|\cdot\|$. Let $\|\cdot\|$ be the induced operator norm on $\mathrm{Hom}(U,U)$. For
$A, B \in \mathrm{Hom}(U,U)$ let $S(A), S(B)$ be the eigenvalue multisets of $A, B$ of
cardinality $n$ respectively. Then

(7.9.10) $\mathrm{pdist}(S(A), S(B)) \le 4e^{\frac{1}{2e}} n \|A - B\|^{\frac{1}{n}} [\max(\|A\|, \|B\|)]^{\frac{n-1}{n}}.$

To prove the theorem we need the following lemma.

Lemma 7.9.8 Let the assumptions of Theorem 7.9.7 hold. Define

(7.9.11) $h(A,B) := \max_{t \in [0,1]} \mathrm{dist}(S((1-t)A + tB), S(B)).$

Then

(7.9.12) $\mathrm{pdist}(S(A), S(B)) \le (2n - 1)h(A,B).$

Proof. Let $S(A) = \{\lambda_1(A), \ldots, \lambda_n(A)\}$, $S(B) = \{\lambda_1(B), \ldots, \lambda_n(B)\}$.
Let $D(z,r) := \{w \in \mathbb{C}, |w - z| \le r\}$. Denote $K_B = \cup_{i=1}^n D(\lambda_i(B), h(A,B))$.
Then $K_B$ is a closed compact set, which decomposes as a union of $k \in \langle n \rangle$
connected components. Let $A(t) = (1-t)A + tB$. Since $\mathrm{dist}(S(A(t)), S(B)) \le
h(A,B)$ we deduce that $K_B$ contains $S(A(t))$ for each $t \in [0,1]$. As $S(A(t))$
varies continuously for $t \in [0,1]$, each connected component of $K_B$ contains
a fixed number of the eigenvalues of $A(t)$ counting with their multiplicities.
Since $A(1) = B$, each connected component of $K_B$ contains the same
number of eigenvalues of $A$ and of $B$ counting with their multiplicities.
Rename the eigenvalues of $B$ such that the indices of the eigenvalues of $A$ and
$B$ are the same in each component of $K_B$.

Let $C = \cup_{i=1}^p D(z_i, h(A,B))$ be such a connected component, where
$z_1, \ldots, z_p$ are $p$ distinct eigenvalues of $B$. $C$ contains exactly $q \ge p$ eigenvalues
of $A$ and $B$ respectively. We claim that if $\lambda \in S(A) \cap C$ then
$\max_{j \in \langle p \rangle} |\lambda - z_j| \le (2p - 1)h(A,B)$. Consider a simple graph $G = (V,E)$,
where $V = \langle p \rangle$ and $(i,j) \in E$ if and only if $|z_i - z_j| \le 2h(A,B)$. Since $C$
is connected it follows that $G$ is connected, hence the maximal distance between
two distinct points in $G$ is $p - 1$. So $|z_i - z_j| \le 2(p-1)h(A,B)$.
Since $|\lambda - z_i| \le h(A,B)$ for some $i \in \langle p \rangle$, it follows that $|\lambda - z_j| \le
(2p - 1)h(A,B) \le (2n - 1)h(A,B)$. Therefore for this particular renaming
of the eigenvalues of $B$ we have the inequality $|\lambda_i(A) - \lambda_i(B)| \le
(2n - 1)h(A,B)$, $i = 1, \ldots, n$. □

Problem 5 shows that the inequality (7.9.12) is sharp.

Proof of Theorem 7.9.7. First observe that

$\mathrm{dist}(S(A), S(B))^n \le \max_{i \in \langle n \rangle} \big|\prod_{j=1}^n (\lambda_i(A) - \lambda_j(B))\big| = \max_{i \in \langle n \rangle} |\det(\lambda_i(A)I - B) - \det(\lambda_i(A)I - A)| \le \max_{z \in \mathbb{C}, |z| \le \rho(A)} |\det(zI - B) - \det(zI - A)|.$

We now apply (7.9.8) to deduce that for $|z| \le \rho(A) \le \|A\|$ we have

$|\det(zI - B) - \det(zI - A)| \le n\|A - B\|[\max(\|zI - A\|, \|zI - B\|)]^{n-1} \le n\|A - B\|[\max(|z| + \|A\|, |z| + \|B\|)]^{n-1} \le n\|A - B\|[\max(2\|A\|, \|A\| + \|B\|)]^{n-1}.$

Thus

(7.9.13) $\mathrm{dist}(S(A), S(B)) \le n^{\frac{1}{n}} \|A - B\|^{\frac{1}{n}} [\max(2\|A\|, \|A\| + \|B\|)]^{\frac{n-1}{n}}.$

We apply the above inequality to $A(t)$ for each $t \in [0,1]$. Clearly, for
$t \in [0,1]$

$\|A(t)\| \le (1-t)\|A\| + t\|B\| \le \max(\|A\|, \|B\|) \Rightarrow \max(2\|A(t)\|, \|A(t)\| + \|B\|) \le 2\max(\|A\|, \|B\|).$

Also $\|A(t) - B\| = (1-t)\|A - B\| \le \|A - B\|$. Hence we deduce

(7.9.14) $h(A,B) \le n^{\frac{1}{n}} \|A - B\|^{\frac{1}{n}} [2\max(\|A\|, \|B\|)]^{\frac{n-1}{n}}.$

Use (7.9.12) to obtain

(7.9.15) $\mathrm{pdist}(S(A), S(B)) \le (2n - 1) n^{\frac{1}{n}} \|A - B\|^{\frac{1}{n}} [2\max(\|A\|, \|B\|)]^{\frac{n-1}{n}}.$

Use the inequality

(7.9.16) $(2n - 1)\, 2 \left(\frac{n}{2}\right)^{\frac{1}{n}} \le 4n \left(\frac{n}{2}\right)^{\frac{1}{n}} \le 4n e^{\frac{1}{2e}}$ for $n \in \mathbb{N}$,

to deduce (7.9.10). (See Problem 6.) □
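
A quick floating point sanity check of the chain (7.9.16) is sketched below. It is not a proof, only an evaluation of the three quantities for a range of $n$; the function name `constants` is chosen for this example.

```python
from math import e, exp


def constants(n):
    """The three quantities compared in (7.9.16) for a given positive integer n."""
    root = (n / 2) ** (1 / n)
    return (2 * n - 1) * 2 * root, 4 * n * root, 4 * n * exp(1 / (2 * e))


for n in (1, 2, 5, 6, 100):
    left, middle, right = constants(n)
    print(n, round(left, 4), round(middle, 4), round(right, 4))
```

The middle and right terms are closest near $n = 5, 6$, which matches the fact that $(n/2)^{1/n}$ is largest when $n/2$ is near $e$.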

The inequality (7.9.10) can be improved by a factor of 2 using the following
theorem [EJRS83]. (See Problem 7.)

Theorem 7.9.9 Let $U$ be an $n$-dimensional vector space over $\mathbb{C}$. For
$A, B \in \mathrm{Hom}(U,U)$ let $S(A), S(B)$ be the eigenvalue multisets of $A, B$ of
cardinality $n$ respectively. Then

(7.9.17) $\mathrm{pdist}(S(A), S(B)) \le \left(2\left\lfloor \frac{n+1}{2} \right\rfloor - 1\right) \max(h(A,B), h(B,A)).$

The above constants are sharp.



Problems 

1. Let $\mu : V_1 \to V_2$ be a nonlinear map. Show

(a) Assume that $\mu$ has a Fréchet derivative at $x$ with respect to
two given norms $\|\cdot\|_1, \|\cdot\|_2$. Then $\mu$ has a Fréchet derivative
at $x$ with respect to any two norms $\|\cdot\|_a, \|\cdot\|_b$.

(b) Suppose that $\mu$ has a Fréchet derivative at $x$. Then $\mu$ is continuous
at $x$ with respect to the standard topologies on $V_1, V_2$.

(c) Assume that $\mu$ has a Fréchet derivative at each point of a compact
set $O \subset V_1$. Then $\mu : O \to V_2$ is uniformly continuous.

2. Let $U_1, \ldots, U_k, V_1, \ldots, V_k$ be finite dimensional inner product vector
spaces over $\mathbb{F} = \mathbb{R}, \mathbb{C}$. Assume that $U := \otimes_{i=1}^k U_i$, $V := \otimes_{i=1}^k V_i$
have the induced inner product. Identify $\mathrm{Hom}(U,V)$ with
$\otimes_{i=1}^k \mathrm{Hom}(U_i, V_i)$. Show

(a) The operator norm on $\mathrm{Hom}(U,V)$, with respect to the Hilbert
norms, is a normal cross norm with respect to the operator
norms on $\mathrm{Hom}(U_i, V_i)$, the Hilbert norms, for $i = 1, \ldots, k$.
Hint: Express the operator norm on $\mathrm{Hom}(U_i, V_i)$ and its conjugate
norm in terms of singular values of $T_i \in \mathrm{Hom}(U_i, V_i)$ for
$i = 1, \ldots, k$.

(b) Assume that $U_1 = V_1 = \ldots = U_k = V_k$. Let $\delta_k : \mathrm{Hom}(U_1, U_1) \to
\mathrm{Hom}(U,U)$. Then $N_k(D\delta_k(T)) = k\|T\|^{k-1}$, where $\|\cdot\|$ is the operator
norm on $\mathrm{Hom}(U_1, U_1)$.

3. Let $U$ be a vector space over $\mathbb{F} = \mathbb{R}, \mathbb{C}$ of dimension $n > 1$. Let
$k \in [2,n]$ be an integer. Show

(a) $W_k := \bigwedge^k U$ is an invariant subspace for each $\delta_k(T)$ given in
Theorem 7.9.4.

(b) Assume that $U$ is an inner product space. Let $T \in \mathrm{Hom}(U,U)$,
and denote by $\|T\| = \sigma_1(T) \ge \ldots \ge \sigma_k(T) \ge \ldots$ the singular
values of $T$. Then

$\hat N_k(D\hat\delta_k(T)) = \sum_{i=1}^k \sigma_1(T) \cdots \sigma_{i-1}(T)\sigma_{i+1}(T) \cdots \sigma_k(T).$

In particular, $\hat N_k(D\hat\delta_k(T)) \le k\sigma_1(T)^{k-1} = k\|T\|^{k-1} = N_k(D\delta_k(T))$.
Equality holds if and only if $\sigma_1(T) = \ldots = \sigma_k(T)$. Hint: Consult
[BhF81].

4. Prove (7.9.9). 

5. Let $A = (2n-1)I_n$, $B = \mathrm{diag}(0, 2, 4, \ldots, 2n-2) \in \mathbb{R}^{n \times n}$. Show that
in this case equality holds in (7.9.12).

6. Using the fact that $\max_{t \in [0,1]} (-t \log t) = \frac{1}{e}$ deduce the last part of
(7.9.16).



7. Show

(a) Let $n = 2k + 1$ and

$A = \mathrm{diag}(\underbrace{0, \ldots, 0}_{k+1}, 2, 4, \ldots, 2k),$

$B = \mathrm{diag}(1, 3, \ldots, 2k-1, \underbrace{2k+1, \ldots, 2k+1}_{k+1}).$

Then equality holds in (7.9.17).

(b) Let $n = 2k$ and

$A = \mathrm{diag}(\underbrace{0, \ldots, 0}_{k}, 2, 4, \ldots, 2k),$

$B = \mathrm{diag}(1, 3, \ldots, 2k-1, 2k+1, \ldots, 2k+1).$

Then equality holds in (7.9.17).

(c) $\max(h(A,B), h(B,A))$ is bounded above by the right-hand side
of (7.9.14).

(d) Deduce from (7.9.17) and the previous part of the problem the
improved version of (7.9.10):

(7.9.18) $\mathrm{pdist}(S(A), S(B)) \le 2e^{\frac{1}{2e}} n \|A - B\|^{\frac{1}{n}} [\max(\|A\|, \|B\|)]^{\frac{n-1}{n}}.$

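7.10 Variation of permanents

Definition 7.10.1 For $A = [a_{ij}] \in \mathbb{D}^{n \times n}$ the permanent of $A$, denoted
as $\mathrm{perm}\,A$, is

$\mathrm{perm}\,A = \sum_{\sigma \in \Sigma_n} \prod_{i=1}^n a_{i\sigma(i)},$

where $\Sigma_n$ is the group of permutations $\sigma : \langle n \rangle \to \langle n \rangle$.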

The determinant and the permanent share some common properties as
multilinear functions on $\mathbb{D}^{n \times n}$, such as Laplace expansions. However, from the
computational point of view determinants are easy to compute while
permanents are hard to compute over all fields, except the fields of
characteristic 2. (Over a field of characteristic 2, $\mathrm{perm}\,A = \det A$.) For
$A \in \mathbb{Z}^{n \times n}$ the permanent of $A$ has a fundamental importance in combinatorics,
and usually is hard to evaluate [Val79]. The main aim of this section
is to generalize the inequality (7.9.8) to the permanents of matrices. The
analog of (7.9.8) holds for the norms $\ell_p$, $p \in [1,\infty]$ [BhE90]. However, it
was also shown in [BhE90] that the analog of (7.9.8) fails for some operator norm
on $\mathbb{C}^{n \times n}$.
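
For small matrices the permanent can be evaluated straight from Definition 7.10.1. The sketch below does this by enumerating all permutations (exponential cost, so for illustration only) and compares the result with the determinant modulo 2; the function names are chosen for this example.

```python
from itertools import permutations


def sign(sigma):
    """Sign of a permutation given as a tuple, computed from its inversion count."""
    n = len(sigma)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if sigma[i] > sigma[j])
    return -1 if inversions % 2 else 1


def perm(A):
    """Permanent: the sum over all permutations of the products a[i][sigma(i)]."""
    n = len(A)
    total = 0
    for sigma in permutations(range(n)):
        term = 1
        for i in range(n):
            term *= A[i][sigma[i]]
        total += term
    return total


def det(A):
    """Determinant via the same expansion, with each term weighted by the sign."""
    n = len(A)
    total = 0
    for sigma in permutations(range(n)):
        term = sign(sigma)
        for i in range(n):
            term *= A[i][sigma[i]]
        total += term
    return total


A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(perm(A), det(A))              # 450 0
print((perm(A) - det(A)) % 2 == 0)  # True: the two agree modulo 2
```

For integer matrices the two expansions differ only in signs, so they agree modulo 2, which is the integer shadow of the characteristic 2 remark above.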

Theorem 7.10.2 Let $\|\cdot\|$ be a norm on $\mathbb{C}^n$, and denote by $\|\cdot\|$ the
induced operator norm on $\mathbb{C}^{n \times n}$. Let $A, B \in \mathbb{C}^{n \times n}$. Then

(7.10.1) $|\mathrm{perm}\,A - \mathrm{perm}\,B| \le \frac{\|A - B\|(\|A\|^n - \|B\|^n)}{2(\|A\| - \|B\|)} + \frac{\|A^* - B^*\|(\|A^*\|^n - \|B^*\|^n)}{2(\|A^*\| - \|B^*\|)}.$

To prove this theorem we need two lemmas. The first lemma gives the
following formula for the standard numerical radius of a square matrix
with complex entries.

Lemma 7.10.3 Let $A \in \mathbb{C}^{n \times n}$. Then

(7.10.2) $r_2(A) = \max_{|z| = 1} \rho\left(\frac{zA + \bar z A^*}{2}\right).$

In particular $r_2(A) \le \frac{1}{2}(\|A\| + \|A^*\|)$ for any operator norm on $\mathbb{C}^n$.

Proof. Let $z \in S^1$, $B = zA$. Assume that $x$ is an eigenvector of $\frac{1}{2}(B +
B^*)$ of length one corresponding to the eigenvalue $\lambda$. Then

$|\lambda| = |\Re(x^*(zA)x)| \le |x^*(zA)x| = |x^*Ax| \le r_2(A).$

Hence the right-hand side of (7.10.2) is not bigger than its left-hand side. On
the other hand there exist $x \in \mathbb{C}^n$, $x^*x = 1$ and $z \in \mathbb{C}$, $|z| = 1$ such that
$r_2(A) = |x^*Ax| = x^*(zA)x$. For this value of $z$ we have that

$r_2(A) \le \lambda_1\left(\frac{zA + \bar z A^*}{2}\right) \le \rho\left(\frac{zA + \bar z A^*}{2}\right).$

Clearly,

$\rho\left(\frac{zA + \bar z A^*}{2}\right) \le \left\|\frac{zA + \bar z A^*}{2}\right\| \le \frac{\|A\| + \|A^*\|}{2}.$

Hence $r_2(A) \le \frac{1}{2}(\|A\| + \|A^*\|)$. □



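For $A \in \mathbb{C}^{n \times n}$, view the matrix $\otimes^n A$ as a linear operator on $\otimes^n \mathbb{C}^n$,
which is identified with $\mathbb{C}^{n^n}$. $\omega_2(\otimes^n A)$, $r_2(\otimes^n A)$ are the numerical range
and the numerical radius of $\otimes^n A$ corresponding to the inner product $\langle \cdot, \cdot \rangle$
on $\otimes^n \mathbb{C}^n$ induced by the standard inner product $y^*x$ on $\mathbb{C}^n$.

Lemma 7.10.4 Let $A \in \mathbb{C}^{n \times n}$. Then $\mathrm{perm}\,A \in \omega_2(\otimes^n A)$.

Proof. Assume that $e_i = (\delta_{i1}, \ldots, \delta_{in})^\top$, $i \in \langle n \rangle$ is the standard basis
in $\mathbb{C}^n$. Then $\otimes_{j=1}^n e_{i_j}$, where $i_j \in \langle n \rangle$ for $j = 1, \ldots, n$, is the standard basis in
$\otimes^n \mathbb{C}^n$. A straightforward calculation shows that

(7.10.3) $\langle \otimes^n A\, x, x \rangle = \mathrm{perm}\,A, \quad x = \frac{1}{\sqrt{n!}} \sum_{\sigma \in \Sigma_n} \otimes_{i=1}^n e_{\sigma(i)}, \quad \langle x, x \rangle = 1,$

where $\Sigma_n$ is the set of permutations $\sigma : \langle n \rangle \to \langle n \rangle$. (See Problem 1.) Hence
$\mathrm{perm}\,A \in \omega_2(\otimes^n A)$. □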
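
The identity (7.10.3) can be checked numerically for small $n$ without forming the $n^n \times n^n$ matrix $\otimes^n A$. The sketch below relies on two standard facts, stated here as assumptions of the example: the entry of the Kronecker power at the multi-indices $(I, J)$ is the product of the entries $a_{i_k j_k}$, and the vector $x$ is real and supported on multi-indices that are permutations of $(1, \ldots, n)$. The function names are chosen for this example.

```python
from itertools import permutations
from math import factorial, sqrt


def perm(A):
    """Permanent from the defining sum over permutations (small n only)."""
    n = len(A)
    total = 0
    for sigma in permutations(range(n)):
        term = 1
        for i in range(n):
            term *= A[i][sigma[i]]
        total += term
    return total


def symmetric_form(A):
    """The quadratic form of the n-fold Kronecker power of A at the vector x of (7.10.3)."""
    n = len(A)
    c = 1 / sqrt(factorial(n))              # every nonzero coordinate of x equals c
    support = list(permutations(range(n)))  # multi-indices where x is nonzero
    total = 0
    for I in support:                       # coordinate of x used on the left
        for J in support:                   # coordinate of x used on the right
            entry = 1
            for i, j in zip(I, J):
                entry *= A[i][j]            # entry of the Kronecker power at (I, J)
            total += c * entry * c
    return total


A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(perm(A), symmetric_form(A))  # the two values agree up to rounding
```

Since $x$ does not depend on $A$, the same computation applied to two matrices gives the difference of their permanents as a value of one quadratic form, which is the observation used in the proof of Theorem 7.10.2.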



Proof of Theorem 7.10.2. Since $x$ given in (7.10.3) does not depend
on $A$ we deduce $\mathrm{perm}\,A - \mathrm{perm}\,B \in \omega_2(\otimes^n A - \otimes^n B)$. Let $|||\cdot|||$ be the maximal
cross norm on $\otimes^n \mathbb{C}^n$ induced by the norm $\|\cdot\|$ on $\mathbb{C}^n$. Denote by $|||\cdot|||$ the
operator norm on $\otimes^n \mathbb{C}^{n \times n}$, induced by the norm $|||\cdot|||$ on $\otimes^n \mathbb{C}^n$.
Use the definition of $r_2(\otimes^n A - \otimes^n B)$ and Lemma 7.10.3 to deduce

(7.10.4) $|\mathrm{perm}\,A - \mathrm{perm}\,B| \le r_2(\otimes^n A - \otimes^n B) \le \frac{1}{2}\big(|||\otimes^n A - \otimes^n B||| + |||\otimes^n A^* - \otimes^n B^*|||\big).$

Theorem 7.8.12 implies that the operator norm $|||\cdot|||$ on $\otimes^n \mathbb{C}^{n \times n}$, induced
by the norm $|||\cdot|||$ on $\otimes^n \mathbb{C}^n$, is a cross norm with respect to the operator
norm $\|\cdot\|$ on $\mathbb{C}^{n \times n}$. Observe next

$\otimes^n A - \otimes^n B = \sum_{i=0}^{n-1} (\otimes^i B) \otimes (A - B) \otimes (\otimes^{n-1-i} A).$

(Here $\otimes^0 A$ means that this term does not appear at all.) Using the triangle
inequality and the fact that the operator norm $|||\cdot|||$ is a cross norm we
deduce

$|||\otimes^n A - \otimes^n B||| \le \sum_{i=0}^{n-1} |||(\otimes^i B) \otimes (A - B) \otimes (\otimes^{n-1-i} A)||| \le \sum_{i=0}^{n-1} \|B\|^i \|A - B\| \|A\|^{n-1-i} = \frac{\|A - B\|(\|A\|^n - \|B\|^n)}{\|A\| - \|B\|}.$

Apply the above inequality to (7.10.4) to deduce the theorem. □
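
The bound (7.10.1) can be evaluated numerically for a concrete norm. The sketch below assumes the max norm on $\mathbb{C}^n$, so that $\|A\|$ is the largest absolute row sum of $A$ and $\|A^*\|$ is the largest absolute column sum of $A$; this choice and the helper names are assumptions of the example, not part of the theorem.

```python
from itertools import permutations


def perm(A):
    """Permanent from the defining sum over permutations (small n only)."""
    n = len(A)
    total = 0
    for sigma in permutations(range(n)):
        term = 1
        for i in range(n):
            term *= A[i][sigma[i]]
        total += term
    return total


def row_norm(M):
    """Operator norm induced by the max norm: the largest absolute row sum."""
    return max(sum(abs(x) for x in row) for row in M)


def col_norm(M):
    """Largest absolute column sum, i.e. row_norm of the conjugate transpose."""
    return max(sum(abs(row[j]) for row in M) for j in range(len(M[0])))


def quotient(a, b, n):
    """(a^n - b^n)/(a - b) written as a sum, so that a == b causes no division by zero."""
    return sum(a ** (n - 1 - i) * b ** i for i in range(n))


def perm_bound(A, B):
    """Right-hand side of (7.10.1) for the max norm."""
    n = len(A)
    D = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]
    first = row_norm(D) * quotient(row_norm(A), row_norm(B), n)
    second = col_norm(D) * quotient(col_norm(A), col_norm(B), n)
    return 0.5 * (first + second)


A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
print(abs(perm(A) - perm(B)), perm_bound(A, B))  # 9 41.5
```

The two halves of the bound correspond to the row-sum and column-sum norms, which is why a matrix and its adjoint both appear in (7.10.1).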



Problems

1. Prove (7.10.3).

2. Let the assumptions of Theorem 7.10.2 hold. Show that

(7.10.5) $|\mathrm{perm}\,A - \mathrm{perm}\,B| \le \|A - B\| \frac{\|A\|^n - \|B\|^n}{\|A\| - \|B\|}$

in the following cases.

(a) $A = A^*$, $B = B^*$.

(b) The norm $\|\cdot\|$ on $\mathbb{C}^n$ is $\|\cdot\|_2$.

3. Show that (7.10.5) holds for the norms $\|\cdot\|_1, \|\cdot\|_\infty$ using the following
steps.

(a) For $A = [a_{ij}] \in \mathbb{C}^{n \times n}$ denote $|A| := [|a_{ij}|] \in \mathbb{R}^{n \times n}$. Show

$|\mathrm{perm}\,A| \le \mathrm{perm}\,|A| \le \prod_{i=1}^n \sum_{j=1}^n |a_{ji}|.$

(b) Let $A = [a_1, \ldots, a_n]$, $B = [b_1, \ldots, b_n] \in \mathbb{C}^{n \times n}$, where $a_i, b_i$ are
the $i$-th columns of $A, B$ respectively, for $i = 1, \ldots, n$. Let

$C_0 = [a_1 - b_1, a_2, \ldots, a_n], \quad C_{n-1} = [b_1, \ldots, b_{n-1}, a_n - b_n],$
$C_i = [b_1, \ldots, b_i, a_{i+1} - b_{i+1}, a_{i+2}, \ldots, a_n]$, for $i = 1, \ldots, n-2$.

Then

$\mathrm{perm}\,A - \mathrm{perm}\,B = \sum_{i=1}^n \mathrm{perm}\,C_{i-1} \Rightarrow |\mathrm{perm}\,A - \mathrm{perm}\,B| \le \sum_{i=1}^n \mathrm{perm}\,|C_{i-1}|.$

(c) Recall that $\|A\|_1 = \|\,|A|\,\|_1 = \max_{j \in \langle n \rangle} \|a_j\|_1$. Then

$\mathrm{perm}\,|C_{i-1}| \le \|a_i - b_i\|_1 \|B\|_1^{i-1} \|A\|_1^{n-i} \le \|A - B\|_1 \|B\|_1^{i-1} \|A\|_1^{n-i}$

for $i = 1, \ldots, n$. Hence, (7.10.5) holds for the $\|\cdot\|_1$ norm.

(d) Use the equalities $\mathrm{perm}\,A^\top = \mathrm{perm}\,A$, $\|A\|_\infty = \|A^\top\|_1$ to deduce
that (7.10.5) holds for the $\|\cdot\|_\infty$ norm.
7.11 The complexity of conv $\Omega_n \otimes \Omega_m$

In this section we show that there is a linear programming problem on $\Omega_{n,m} :=
\mathrm{conv}\,\Omega_n \otimes \Omega_m$, whose solution gives an answer to the subgraph isomorphism
problem, that will be stated precisely below. The subgraph isomorphism
problem belongs to the class of NP-complete problems [GaJ79]. This
shows, in our opinion, that the number of half spaces characterizing $\Omega_{n,m}$
is probably not polynomial in $\max(m,n)$, which is analogous to Theorem
7.8.6.

By a graph $G$ in this section we mean an undirected simple graph on the
set of vertices $V$ and the set of edges $E$. Here $E$ is a subset of the unordered
pairs $P(V) := \{(u,v),\ u \ne v \in V\}$, where $(u,v)$ and $(v,u)$ are identified. We
will denote by $G = (V,E)$ the graph to emphasize the set of vertices and
edges of $G$. The degree of $v \in V$, denoted by $\deg v$, is the number of edges
that are connected to $v$, i.e. the number of pairs $(v,u) \in E$. Clearly, $\sum_{v \in V} \deg v = 2\#E$.

A vertex $v \in V$ is called isolated if $\deg v = 0$. Denote by $V_{iso}$ the set of
isolated vertices in $V$. A subgraph $G_1 = (V_1, E_1)$ of $G$ is given by the
condition $V_1 \subset V$, $E_1 \subset E \cap P(V_1)$.

Definition 7.11.1 Let $G = (V,E)$, $G' = (V',E')$ be two undirected
simple graphs. Then $G$ and $G'$ are called isomorphic if the following condition
holds. There is a bijection $\phi : V \setminus V_{iso} \to V' \setminus V'_{iso}$ such that
$(u,v) \in E$ if and only if $(\phi(u), \phi(v)) \in E'$. $G'$ is called isomorphic to a
subgraph of $G$ if there exists a subgraph $G_1$ of $G$ such that $G'$ is isomorphic
to $G_1$.

We note that our definition of isomorphism of two graphs is slightly different
from the standard definition of graph isomorphism. Since the set of
isolated vertices in a graph is easily identified, i.e. in $(\#V)^2$ steps, from the
complexity point of view our definition is equivalent to the standard definition
of graph and subgraph isomorphisms. We recall that the subgraph
isomorphism problem (SGIP), which asks if $G'$ is isomorphic to a subgraph
of $G$, is an NP-complete problem [GaJ79].

We now relate the SGIP to certain linear programming problems on
$\Omega_{m,n}$. We first recall the notion of the adjacency matrix of $G$. Assume that
$\#V = m$ and label the vertices in $V$ as $1, \ldots, m$, i.e. we let $V = \langle m \rangle$.
Then the adjacency matrix $A(G) = [a_{ij}]_{i,j=1}^m \in \{0,1\}^{m \times m}$ is a symmetric
matrix with zero diagonal such that $a_{ij} = 1$ if and only if the edge $(i,j)$
is in $E$. Note that a different labeling of the elements of $V$ gives rise
to the adjacency matrix $A' = PA(G)P^\top$ for some permutation matrix
$P \in \mathcal{P}_m$. Thus the graph $G$ gives rise to the conjugacy class of matrices
$\mathcal{A}(G) = \{PA(G)P^\top,\ P \in \mathcal{P}_m\}$. The following result is straightforward, see
Problem 1.

Lemma 7.11.2 Let $G = (V,E)$, $G' = (V',E')$ be two undirected graphs.
Assume that $\#V = \#V'$. Then $G$ and $G'$ are isomorphic if and only if
$\mathcal{A}(G) = \mathcal{A}(G')$.

We next introduce the notion of the vertex-edge incidence matrix $B(G) \in
\{0,1\}^{\#V \times \#E}$. Assume that $G = (V,E)$ and let $m = \#V$, $n = \#E$. Label the
vertices of $V$ and the edges of $E$ by $\langle m \rangle$ and $\langle n \rangle$ respectively. Then $B(G) = [b_{ij}]_{i=1,j=1}^{m,n} \in
\{0,1\}^{m \times n}$, such that $b_{ij} = 1$ if and only if the edge $j$ contains the vertex $i$. A
different labeling of $V$ and $E$ gives rise to the vertex-edge incidence matrix
$B' = PB(G)Q$ for some $P \in \mathcal{P}_m$, $Q \in \mathcal{P}_n$. Thus the graph $G$ gives rise to
the equivalence class of matrices $\mathcal{B}(G) = \{PB(G)Q,\ P \in \mathcal{P}_m,\ Q \in \mathcal{P}_n\}$.

Lemma 7.11.3 Let $G = (V,E)$, $G' = (V',E')$ be two undirected graphs.
Assume that $\#V = \#V'$, $\#E = \#E'$. Then $G$ and $G'$ are isomorphic if
and only if $\mathcal{B}(G) = \mathcal{B}(G')$.
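
Both matrices are easy to build from an edge list. The sketch below uses 0-based vertex labels and a list of pairs as the graph representation, which are conventions of this example rather than of the text; `permute` applies the relabeling $u \mapsto \pi(u)$, i.e. it forms $PA(G)P^\top$ for the corresponding permutation matrix.

```python
def adjacency(m, edges):
    """Adjacency matrix A(G) of a simple graph on vertices 0, ..., m-1."""
    A = [[0] * m for _ in range(m)]
    for u, v in edges:
        A[u][v] = A[v][u] = 1
    return A


def incidence(m, edges):
    """Vertex-edge incidence matrix B(G): entry (i, j) is 1 when edge j contains vertex i."""
    B = [[0] * len(edges) for _ in range(m)]
    for j, (u, v) in enumerate(edges):
        B[u][j] = B[v][j] = 1
    return B


def permute(A, pi):
    """P A P^T for the permutation matrix P that sends vertex u to pi[u]."""
    m = len(A)
    out = [[0] * m for _ in range(m)]
    for u in range(m):
        for v in range(m):
            out[pi[u]][pi[v]] = A[u][v]
    return out


edges = [(0, 1), (1, 2)]                 # a path on three vertices
pi = [2, 0, 1]                           # a relabeling of the vertices
relabeled = [(pi[u], pi[v]) for u, v in edges]

print(adjacency(3, edges))               # [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
print(incidence(3, edges))               # [[1, 0], [1, 1], [0, 1]]
print(adjacency(3, relabeled) == permute(adjacency(3, edges), pi))  # True
```

Each column of the incidence matrix contains exactly two ones, so the row sums (the degrees) add up to twice the number of edges.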

We now restate the SGIP in terms of bilinear programming on $\Omega_m \times \Omega_n$.
It is enough to consider the following case.

Lemma 7.11.4 Let $G' = (V',E')$, $G = (V,E)$ and assume $m' :=
\#V' \le m := \#V$, $n' := \#E' \le n := \#E$. Let $B(G') \in \{0,1\}^{m' \times n'}$, $B(G) \in
\{0,1\}^{m \times n}$ be the vertex-edge incidence matrices of $G'$ and $G$. Denote by
$C(G') \in \{0,1\}^{m \times n}$ the matrix obtained from $B(G')$ by adding additional
$m - m'$ and $n - n'$ zero rows and columns respectively. Then

(7.11.1) $\max_{P \in \mathcal{P}_m, Q \in \mathcal{P}_n} \mathrm{tr}(C(G')QB(G)^\top P) \le 2n'.$

Equality holds if and only if $G'$ is isomorphic to a subgraph of $G$.

Proof. Let $B_1 = [b_{ij,1}]_{i=1,j=1}^{m,n} := P^\top B(G)Q^\top \in \mathcal{B}(G)$. Note that $B_1$
has exactly the same number of ones as $B(G)$, namely $2n$, since each edge
is connected to two vertices. Similarly $C(G') = [c_{ij}]_{i=1,j=1}^{m,n}$
has exactly the same number of ones as $B(G')$, namely $2n'$. Hence
$\langle C(G'), B_1 \rangle = \mathrm{tr}(C(G')B_1^\top) \le 2n'$. Assume that $\mathrm{tr}(C(G')B_1^\top) = 2n'$. So
we can delete $2(n - n')$ ones in $B_1$ to obtain $C(G')$. Note that deleting
$2(n - n')$ ones from $B_1$ means to delete $n - n'$ edges from the graph $G$. Indeed,
assume $c_{ij} = b_{ij,1} = 1$. So the vertex $i$ is connected to the edge $j$. Hence
there exists another vertex $i' \ne i$ such that $c_{i'j} = 1$. As $\mathrm{tr}(C(G')B_1^\top) = 2n'$
we deduce that $b_{i'j,1} = 1$. Hence, if we rename the vertices and the edges
of $G$ corresponding to $B_1$ we deduce that $G'$, represented by the matrix
$B(G')$, is a subgraph of $G$. □
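
For very small graphs the criterion of Lemma 7.11.4 can be tested by brute force. The sketch below maximizes the overlap between $C(G')$ and all row and column permutations of $B(G)$, which is the same set of values as $\mathrm{tr}(C(G')QB(G)^\top P)$; it runs in $m!\,n!$ steps and is meant only as an illustration. The function names are chosen for this example, and the input is assumed to satisfy $m' \le m$, $n' \le n$.

```python
from itertools import permutations


def pad(B, m, n):
    """Embed B into an m x n matrix by appending zero rows and columns."""
    rows = [list(row) + [0] * (n - len(row)) for row in B]
    return rows + [[0] * n for _ in range(m - len(rows))]


def max_overlap(C, B):
    """Maximum of tr(C Q B^T P) over all permutation matrices P, Q (brute force)."""
    m, n = len(B), len(B[0])
    best = 0
    for p in permutations(range(m)):        # relabeling of the vertices of G
        for q in permutations(range(n)):    # relabeling of the edges of G
            value = sum(C[i][j] * B[p[i]][q[j]] for i in range(m) for j in range(n))
            best = max(best, value)
    return best


def contains_subgraph(B_small, B_big):
    """True when the maximum in (7.11.1) reaches 2n', n' being the edge count of the small graph."""
    n_small = len(B_small[0])
    C = pad(B_small, len(B_big), len(B_big[0]))
    return max_overlap(C, B_big) == 2 * n_small


triangle = [[1, 0, 1], [1, 1, 0], [0, 1, 1]]           # 3 vertices, 3 edges
path3 = [[1, 0], [1, 1], [0, 1]]                       # 3 vertices, 2 edges
path4 = [[1, 0, 0], [1, 1, 0], [0, 1, 1], [0, 0, 1]]   # 4 vertices, 3 edges

print(contains_subgraph(path3, triangle))  # True
print(contains_subgraph(triangle, path4))  # False
```

The factorial cost of this search is consistent with the hardness discussion in this section; the point of the lemma is the reformulation, not an efficient algorithm.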

We now show how to translate the maximum in (7.11.1) to a linear programming
problem on $\Omega_{m,n}$. As in §2.8 for $F \in \mathbb{R}^{n \times m}$ let $\hat F \in \mathbb{R}^{nm}$
be a vector composed of the columns of $F$, i.e. first we have the coordinates
of the first column, then the coordinates of the second column,
and the last $n$ coordinates are the coordinates of the last column. Hence
$\widehat{XFY} = (Y^\top \otimes X)\hat F$, where $Y^\top \otimes X$ is the Kronecker tensor product.

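Lemma 7.11.5 Let $C, B \in \mathbb{R}^{m \times n}$. Then

(7.11.2) $\max_{P \in \mathcal{P}_m, Q \in \mathcal{P}_n} \mathrm{tr}(CQB^\top P) = \max_{Z \in \Omega_{n,m}} (\hat C)^\top Z \hat B.$

Proof. Since $\Omega_n^\top = \Omega_n$ and $\mathcal{E}(\Omega_m) = \mathcal{P}_m$ we deduce

$\max_{P \in \mathcal{P}_m, Q \in \mathcal{P}_n} \mathrm{tr}(CQB^\top P) = \max_{X \in \Omega_m, Y \in \Omega_n} \mathrm{tr}(CYB^\top X).$

Observe next that

$\mathrm{tr}(CYB^\top X) = \mathrm{tr}(C^\top (X^\top BY^\top)) = (\hat C)^\top (Y \otimes X^\top)\hat B.$

As $\Omega_{n,m} = \mathrm{conv}\,\Omega_n \otimes \Omega_m$ we deduce (7.11.2). □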
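
The two vectorization identities used here can be verified on small integer matrices. The sketch below implements column stacking and the Kronecker product with plain lists; the helper names are chosen for this example.

```python
def matmul(X, Y):
    """Ordinary matrix product of two list-of-rows matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def transpose(X):
    return [list(col) for col in zip(*X)]


def kron(X, Y):
    """Kronecker product: block (i, j) of the result is X[i][j] * Y."""
    return [[x * y for x in rx for y in ry] for rx in X for ry in Y]


def vec(F):
    """Stack the columns of F into one list, first column first."""
    return [F[i][j] for j in range(len(F[0])) for i in range(len(F))]


def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]


def trace(M):
    return sum(M[i][i] for i in range(len(M)))


C = [[1, 2, 0], [0, 1, 3]]
B = [[2, 0, 1], [1, 1, 0]]
X = [[0, 1], [1, 0]]                     # a 2 x 2 permutation matrix
Y = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]    # a 3 x 3 permutation matrix

# vec(X F Y) = (Y^T kron X) vec(F), checked here with F = C
lhs = vec(matmul(matmul(X, C), Y))
rhs = matvec(kron(transpose(Y), X), vec(C))
print(lhs == rhs)  # True

# tr(C Y B^T X) = vec(C)^T (Y kron X^T) vec(B)
t1 = trace(matmul(matmul(matmul(C, Y), transpose(B)), X))
t2 = sum(c * w for c, w in zip(vec(C), matvec(kron(Y, transpose(X)), vec(B))))
print(t1 == t2)    # True
```

The second identity is what turns the bilinear objective in $X$ and $Y$ into a linear functional of the single matrix $Z = Y \otimes X^\top$.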



In summary we showed that if we can solve exactly the linear programming
problem (7.11.2), using Lemma 7.11.4 we can determine if $G'$
is isomorphic to a subgraph of $G$. Since the SGIP is NP-complete, we
believe that this implies that for general $m, n$ the number of half spaces
characterizing $\Omega_{n,m}$ cannot be polynomial.



Problems 

1. Prove Lemma 7.11.2. 

2. Prove Lemma 7.11.3. 



7.12 Vivanti-Pringsheim theorem and applications

We start with the following basic result on the power series in one complex
variable, which is usually called the Cauchy-Hadamard formula on power
series [Rem98, §4.1].

Theorem 7.12.1 Let

(7.12.1) $f(z) = \sum_{i=0}^\infty a_i z^i$, $a_i \in \mathbb{C}$, $i = 0, 1, \ldots$, and $z \in \mathbb{C}$,

be a power series. Define

(7.12.2) $R = R(f) := \frac{1}{\limsup_i |a_i|^{\frac{1}{i}}} \in [0, \infty].$

($R$ is called the radius of convergence of the series.) Then

1. For $R = 0$ the series converges only for $z = 0$.

2. For $R = \infty$ the series converges absolutely and uniformly for each
$z \in \mathbb{C}$, and $f(z)$ is an entire function, i.e. analytic on $\mathbb{C}$.

3. For $R \in (0, \infty)$ the series converges absolutely and uniformly to an
analytic function for each $z$, $|z| < R$, and diverges for each $|z| > R$.
Furthermore, there exists $\zeta$, $|\zeta| = R$, such that $f(z)$ cannot be extended
to an analytic function in any neighborhood of $\zeta$. ($\zeta$ is called
a singular point of $f$.)

Consider the Taylor series for the complex valued function $\frac{1}{1-z}$:

$\frac{1}{1-z} = \sum_{i=0}^\infty z^i.$

Then $R = 1$, the function is analytic in $\mathbb{C} \setminus \{1\}$, and has a singular
point at $z = 1$. The Vivanti-Pringsheim theorem is an extension of this example
[Viv93, Pri94].

Theorem 7.12.2 Let the power series $f(z) = \sum_{i=0}^\infty a_i z^i$ have positive
finite radius of convergence $R$, and suppose that the sequence $a_i$, $i = 0, 1, \ldots$,
is eventually nonnegative. (I.e. all but finitely many of its coefficients are
real and nonnegative.) Then $\zeta := R$ is a singular point of $f$.
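
A small numerical illustration, under the assumption that we look at the specific rational function $f(z) = \frac{1}{1-2z} + \frac{1}{1+z}$, whose coefficients $a_i = 2^i + (-1)^i$ are nonnegative: the Cauchy-Hadamard quantity approaches $R = \frac{1}{2}$, and $(1 - 2r)f(r)$ stays bounded away from zero as $r$ increases to $R$, so $z = R$ is indeed a pole. The function names are chosen for this example.

```python
def coefficient(i):
    """i-th Maclaurin coefficient of f(z) = 1/(1 - 2z) + 1/(1 + z)."""
    return 2 ** i + (-1) ** i


def radius_estimate(i):
    """1 / |a_i|^(1/i), which approaches R(f) = 1/2 for this f as i grows."""
    return 1 / coefficient(i) ** (1 / i)


def scaled_value(r):
    """(1 - 2r) f(r); a nonzero limit as r increases to 1/2 signals a simple pole at 1/2."""
    return (1 - 2 * r) * (1 / (1 - 2 * r) + 1 / (1 + r))


print([coefficient(i) for i in range(6)])   # [2, 1, 5, 7, 17, 31]
print(radius_estimate(200))                 # close to 0.5
print(scaled_value(0.4999999))              # close to 1
```

The other pole of this $f$ sits at $z = -1$, outside the circle of convergence, so it does not affect the radius.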

See [Rem98, §8.1] for a proof. In what follows we need a stronger version
of this theorem for rational functions, e.g. [Fri78b, Thm 2]. Assume that
$f(z)$ is a rational function with $0$ as a point of analyticity. So $f$ has power
series (7.12.1). Assume that $f$ is not a polynomial, i.e. $R(f) \in (0, \infty)$. Then
$f$ has the following form.

(7.12.3) $f(z) = P(z) + \sum_{i=1}^N \sum_{j=1}^{p_i} \frac{b_{j,i}}{(1 - \lambda_i z)^j}$, $P \in \mathbb{C}[z]$, $\lambda_i, b_{p_i,i} \in \mathbb{C} \setminus \{0\}$, $\lambda_i \ne \lambda_{i'}$ for $i \ne i'$.

Note that

(7.12.4) $R(f) = \frac{1}{\max_i |\lambda_i|}.$



Definition 7.12.3 Let $f(z)$ be a rational function of the form (7.12.3).
Let $p := \max_{|\lambda_i| = R(f)^{-1}} p_i$. Denote

$f_{prin}(z) = \sum_{i:\ |\lambda_i| = R(f)^{-1} \text{ and } p_i = p} \frac{b_{p,i}}{(1 - \lambda_i z)^p}.$

$f_{prin}$ is called the principal part of $f$, i.e. $f - f_{prin}$ does not have poles of
order $p$ on $|z| = R(f)$.

Theorem 7.12.4 Let $f(z)$ be a rational function of the form (7.12.3).
Assume that the sequence of coefficients in the power expansion (7.12.1) is
eventually nonnegative. Then

1. The set $\{\lambda_1, \ldots, \lambda_N\}$ is symmetric with respect to $\mathbb{R}$. That is, for
each $i \in \langle N \rangle$ there exists $i' \in \langle N \rangle$ such that $\bar\lambda_i = \lambda_{i'}$. Furthermore
$p_i = p_{i'}$, and $\bar b_{j,i} = b_{j,i'}$ for $j = 1, \ldots, p_i$ and $i = 1, \ldots, N$.

2. After renaming the indices in $\langle N \rangle$ we have: $\lambda_1 = \frac{1}{R(f)}$; $|\lambda_i| = \lambda_1$ for
$i = 2, \ldots, M$, and $|\lambda_i| < \lambda_1$ for $i > M$. (Here $M \in [1,N]$.)

3. Let $p := p_1$. There exists an integer $L \in [1,M]$ such that $p_i = p$ for
$i \in [2,L]$, and $p_i < p$ for $i > L$.

4. $b_{p,1} > 0$ and there exists $m \in [1,L]$ such that $|b_{p,i}| = b_{p,1}$ for $i =
1, \ldots, m$ and $|b_{p,i}| < b_{p,1}$ for $i \in [m+1, L]$.

5. Let $\zeta = e^{\frac{2\pi\sqrt{-1}}{m}}$. After renaming the indices $2, \ldots, m$, $\lambda_i = \zeta^{i-1}\lambda_1$ for
$i = 2, \ldots, m$. Furthermore there exists an integer $l \in [1,m]$ such that
$b_{p,i} = \zeta^{l(i-1)} b_{p,1}$ for $i = 2, \ldots, m$.

6. $f_{prin}(\zeta z) = \zeta^{-l} f_{prin}(z)$.

Proof. We outline the major steps in the proof of this theorem. For all
details see the proof of [Fri78b, Thm 2]. By considering $g(z) = f(z) + P_1$
for some polynomial $P_1$, we may assume that the MacLaurin coefficients
of $g$ are real and nonnegative. As $g_{prin} = f_{prin}$, without loss of generality
we may assume that the MacLaurin coefficients of $f$ are real and nonnegative.
Hence $\overline{f(\bar z)} = f(z)$ for each $z$ where $f$ is defined. This shows part 1.

Part 2 follows from Theorem 7.12.2. For simplicity of the exposition
assume that $R(f) = 1$. Recall that for each singular point $\lambda_i^{-1} = \bar\lambda_i$ of $f$
on the circle $|z| = 1$ we have the equality

(7.12.5) $b_{p_i,i} = \lim_{r \nearrow 1} (1 - r)^{p_i} f(\bar\lambda_i r).$

In particular $b_{p_1,1} = \lim_{r \nearrow 1} (1 - r)^{p_1} f(r) \ge 0$. Since $b_{p_1,1} \ne 0$ we obtain
that $b_{p_1,1} > 0$. Let $p := p_1$. Since all the MacLaurin coefficients of $f$ are
nonnegative we have the inequality $|f(z)| \le f(|z|)$ for all $|z| < 1$. Hence
$\limsup_{r \nearrow 1} (1 - r)^p |f(\bar\lambda_i r)| \le b_{p,1}$. This inequality and (7.12.5) imply parts
3-4.

For $m = 1$ parts 5-6 are trivial. Assume that $m > 1$. Let $b_{p,i} =
\eta_i b_{p,1}$, $|\eta_i| = 1$ for $i = 2, \ldots, m$. In view of part 1, for each integer
$i \in [2,m]$ there exists an integer $i' \in [2,m]$ such that $\bar\lambda_i = \lambda_{i'}$, $\bar\eta_i = \eta_{i'}$.
Consider the function

$g(z) = 2f(z) - \eta_i f(\lambda_i z) - \bar\eta_i f(\bar\lambda_i z) = \sum_{j=0}^\infty 2(1 - \Re(\eta_i \lambda_i^j)) a_j z^j.$

So the MacLaurin coefficients of $g$ are nonnegative. Clearly $R(g) \ge 1$, and
if $g$ has a pole at $1$, its order is at most $p - 1$. This implies the equality

$2f_{prin}(z) - \eta_i f_{prin}(\lambda_i z) - \bar\eta_i f_{prin}(\bar\lambda_i z) = 0.$

Therefore the set $\{\lambda_1, \ldots, \lambda_m\}$ forms a multiplicative group of order $m$.
Hence, it is the group of all $m$-th roots of unity. So we can rename the indices
$2, \ldots, m$ such that $\lambda_i = \zeta^{i-1}$ for $i = 1, \ldots, m$. Similarly, $\eta_1 = 1, \eta_2, \ldots, \eta_m$
form a multiplicative group, which must be a subgroup of the $m$-th roots of 1.
Furthermore $\lambda_i \mapsto \eta_i$ is a group homomorphism. This shows part 5. Part 5
straightforwardly implies part 6. □



Definition 7.12.5 Let $S = \{\lambda_1, \ldots, \lambda_n\} \subset \mathbb{C}$ be a finite multiset. I.e.
a point $z \in S$ appears exactly $m(z) \ge 1$ times in $S$. Denote

1. $r(S) := \max_{z \in S} |z|$.

2. For any $t \ge 0$ denote by $S(t)$ the multiset $S \cap \{z \in \mathbb{C}, |z| = t\}$.

3. For an integer $k \in \mathbb{N}$ denote by $s_k(S) := \sum_{i=1}^n \lambda_i^k$ the $k$-th moment
of $S$. Let $s_0(S) = n$.

4. For $z = (z_1, \ldots, z_N)^\top \in \mathbb{C}^N$ denote by $\sigma_k(z) = \sum_{1 \le i_1 < \ldots < i_k \le N} z_{i_1} \cdots z_{i_k}$
for $k = 1, \ldots, N$ the elementary symmetric polynomials in $z_1, \ldots, z_N$.

5. Denote $z(S) = (\lambda_1, \ldots, \lambda_n)^\top \in \mathbb{C}^n$. Then $\sigma_k(S) := \sigma_k(z(S))$ for
$k = 1, \ldots, n$ are called the elementary symmetric polynomials of $S$.

$S$ is called a Frobenius multiset if the following conditions hold.

1. $\bar S = S$.

2. $r(S) \in S$.

3. $m(z) = 1$ for each $z \in S(r(S))$.

4. Assume that $\#S(r(S)) = m$. Then $\zeta S = S$ for $\zeta = e^{\frac{2\pi\sqrt{-1}}{m}}$.

A simple example of a Frobenius multiset is the set of eigenvalues, counted
with their multiplicities, of a square nonnegative irreducible matrix.
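
The four conditions can be tested numerically for a finite list of complex numbers. The sketch below compares multisets up to a tolerance `tol`, which is an assumption forced by floating point arithmetic and is only reliable when distinct points are well separated; it also assumes a nonempty list. The function names are chosen for this example.

```python
import cmath


def moment(S, k):
    """The k-th moment s_k(S): the sum of the k-th powers of the elements."""
    return sum(z ** k for z in S)


def same_multiset(S, T, tol=1e-9):
    """Greedy comparison of two multisets of complex numbers up to tol."""
    remaining = list(T)
    for s in S:
        for idx, t in enumerate(remaining):
            if abs(s - t) <= tol:
                del remaining[idx]
                break
        else:
            return False
    return not remaining


def is_frobenius(S, tol=1e-9):
    """Check conditions 1-4 of a Frobenius multiset for the list S."""
    r = max(abs(z) for z in S)
    peripheral = [z for z in S if abs(abs(z) - r) <= tol]   # the multiset S(r(S))
    zeta = cmath.exp(2j * cmath.pi / len(peripheral))
    closed_under_conjugation = same_multiset([complex(z).conjugate() for z in S], S, tol)
    radius_in_S = any(abs(z - r) <= tol for z in S)
    simple_on_circle = all(sum(abs(z - w) <= tol for w in S) == 1 for z in peripheral)
    rotation_invariant = same_multiset([zeta * z for z in S], S, tol)
    return closed_under_conjugation and radius_in_S and simple_on_circle and rotation_invariant


omega = cmath.exp(2j * cmath.pi / 3)
cube_roots = [1, omega, omega ** 2]      # eigenvalues of a 3 x 3 cyclic permutation matrix

print(is_frobenius(cube_roots))          # True
print(is_frobenius([1, -1, 0.5]))        # False: multiplying by -1 does not preserve the set
print(is_frobenius([1, -1, 0.5, -0.5]))  # True
print(abs(moment(cube_roots, 3) - 3) < 1e-9, abs(moment(cube_roots, 1)) < 1e-9)  # True True
```

The cube roots of unity are the spectrum of an irreducible nonnegative matrix, and their moments are $3$ when $3$ divides $k$ and $0$ otherwise, so they are nonnegative for every $k$.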

Theorem 7.12.6 Let $S \subset \mathbb{C}$ be a multiset. Assume that the moments
$s_k(S)$, $k \in \mathbb{N}$ are eventually nonnegative. Then the following conditions
hold.

1. $r(S) \in S$.

2. Denote $\mu := m(r(S))$. Assume that $\lambda \in S(r(S))$. Then $m(\lambda) \le \mu$.

3. Assume that $r(S) > 0$ and suppose that $\lambda_1 = r(S), \lambda_2, \ldots, \lambda_m$ are all
the distinct elements of $S$ satisfying the conditions $|\lambda_i| = r(S)$, $m(\lambda_i) =
\mu$ for $i = 1, \ldots, m$. Then $\frac{\lambda_i}{r(S)}$, $i = 1, \ldots, m$ are the $m$ distinct $m$-th roots
of 1.

4. Let $\zeta = e^{\frac{2\pi\sqrt{-1}}{m}}$. Then $\zeta S(r(S)) = S(r(S))$.

5. If $r(S) > 0$, $\mu = 1$ and none of the other elements of $S$ are positive,
then $S$ is a Frobenius multiset.

Proof. For a finite multiset $S \subset \mathbb{C}$ define

(7.12.6) $f_S(z) := \sum_{\lambda \in S} \frac{1}{1 - \lambda z} = \sum_{k=0}^\infty s_k(S) z^k.$

Apply Theorem 7.12.4 to deduce the parts 1-4.

Assume that $r(S)$ is the only positive element of $S$ and $\mu = 1$. If $m = 1$
then $S$ is a Frobenius set. Suppose that $m > 1$. Consider the function
$g(z) = 2f_S(z) - f_S(\zeta z) - f_S(\bar\zeta z)$. We claim that $g$ is the zero function. Suppose
to the contrary that $g \ne 0$. In view of part 4 we deduce $R(f_S) < R(g) < \infty$.
Since the MacLaurin coefficients of $g$ are eventually nonnegative, Theorem
7.12.4 yields that $g$ must have a singular point $\xi > 0$ whose residue at $\xi$ is
positive. Since $r(S)$ is the only positive element of $S$, all other positive
residues of $g$, coming from $2f_S$, are not located on positive numbers. Hence
the residues of $g$ at its poles located on the positive axis are negative integers.
The above contradiction shows that $g = 0$, i.e. $S$ is a Frobenius
multiset. □

Let $A \in \mathbb{C}^{n \times n}$ and assume that $S(A)$ is the eigenvalue multiset of $A$. Then

(7.12.7) $s_k(S(A)) = \mathrm{tr}\,A^k, \quad k = 0, 1, \ldots.$

Corollary 7.12.7 Let $A \in \mathbb{C}^{n \times n}$. Denote by $S$ the multiset consisting
of all eigenvalues of $A$, counted with multiplicities. Assume that
the traces of $A^k$, $k \in \mathbb{N}$ are eventually nonnegative. Then the following
conditions hold.

1. $\rho(A)$ is an eigenvalue of $A$.

2. Assume that the algebraic multiplicity of $\rho(A)$ is $\mu$. Let $\lambda$ be an eigenvalue
of $A$ of multiplicity $m(\lambda)$ satisfying $|\lambda| = \rho(A)$. Then $m(\lambda) \le \mu$.

3. Assume that $\rho(A) > 0$ and suppose that $\lambda_1 = \rho(A), \lambda_2, \ldots, \lambda_m$ are all
the distinct eigenvalues of $A$ satisfying the conditions $|\lambda_i| = \rho(A)$, $m(\lambda_i) =
\mu$ for $i = 1, \ldots, m$. Then $\frac{\lambda_i}{\rho(A)}$, $i = 1, \ldots, m$ are the $m$ distinct $m$-th roots
of 1.

4. Let $\zeta = e^{\frac{2\pi\sqrt{-1}}{m}}$. Then $\zeta S(\rho(A)) = S(\rho(A))$.

5. If $\rho(A) > 0$ is an algebraically simple eigenvalue of $A$, and none of
the other eigenvalues of $A$ are positive, then $S$ is a Frobenius multiset.

Definition 7.12.8 $A \in \mathbb{R}^{n \times n}$ is called eventually nonnegative if there exists
a positive integer $N$ such that $A^k \ge 0$ for all integers $k > N$.

Lemma 7.12.9 Let $B \in \mathbb{R}^{n \times n}$. Then there exists a positive integer $M$
with the following property. Assume that $L > M$ is a prime. Suppose that
$B^L$ is similar to a nonnegative matrix. Then the eigenvalue multiset of $B$
is a union of Frobenius multisets.

Proof. Associate with the eigenvalues of $B$ the following set $T \subset S^1$.
For $0 \ne \lambda \in \mathrm{spec}\,B$ we let $\frac{\lambda}{|\lambda|} \in T$. For $\lambda \ne \kappa \in \mathrm{spec}\,B$ satisfying the
conditions $|\lambda| = |\kappa| > 0$ we assume that $\frac{\lambda}{\kappa} \in T$. Let $T_1 \subset T$ be the
set of all roots of 1 that are in $T$. Recall that $\eta \in S^1$ is called a primitive
$k$-root of 1, if $\eta^k = 1$, and $\eta^{k'} \ne 1$ for all integers $k' \in [1,k)$. $k$ is called
the primitivity index of $\eta$. Let $L > k$ be a prime. Then $\eta^L$ is a $k$-primitive
root of 1. Furthermore, the map $\eta \mapsto \eta^L$ is an isomorphism of the group
of all $k$-th roots of 1, which commutes with conjugation $\eta \mapsto \bar\eta$. Clearly,
if $\eta \in S^1$, and $\eta$ is not a root of unity, then $\eta^L$ is not a root of unity. Define
$M \in \mathbb{N}$ to be the maximum over all primitivity indices of $\eta \in T_1$. If
$T_1 = \emptyset$ then $M = 1$. Let $L > M$ be a prime. Assume that $B^L$ is similar
to $C \in \mathbb{R}_+^{n \times n}$. Apply Theorem 6.4.4 and the Perron-Frobenius theorem to
each irreducible diagonal block of the matrix in (6.4.3), to deduce that
the eigenvalue multiset $S(C)$ is $\cup_{j=1}^{t+f} F_j$ and each $F_j$ is a Frobenius multiset.

Clearly, $\mathrm{spec}\,\bar B = \mathrm{spec}\,B$ and $\mathrm{spec}\,B^L = \mathrm{spec}\,C$. Observe next that
the condition that $L$ is a prime satisfying $L > M$ implies that the map
$z \mapsto z^L$ induces a 1-1 and onto map $\phi : \mathrm{spec}\,B \to \mathrm{spec}\,C$. Moreover,
$\phi^{-1}(r) > 0$ if and only if $r > 0$. Hence $\phi$ can be extended to a 1-1 and
onto map $\phi : S(B) \to S(C)$. Furthermore, $\phi^{-1}(F_j)$ is a Frobenius set, where
the number of distinct points in $F_j(r(F_j))$ is equal to the number of points in
$\phi^{-1}(F_j)(r(\phi^{-1}(F_j)))$. Hence $S(B) = \cup_{j=1}^{t+f} \phi^{-1}(F_j)$ is a decomposition of $S(B)$
to a union of Frobenius multisets. □



Corollary 7.12.10 Assume that a matrix $B \in \mathbb{R}^{n \times n}$ is similar to an
eventually nonnegative matrix. Then the eigenvalue multiset $S(B)$ of B is
a union of Frobenius multisets.

Theorem 7.12.11 Assume that the eigenvalue multiset $S(B)$ of
$B \in \mathbb{R}^{n \times n}$ is a union of Frobenius multisets. Then there exists an eventually
nonnegative $A \in \mathbb{R}^{n \times n}$ such that $S(A) = S(B)$.

Proof. It is enough to show that for a given Frobenius multiset F there
exists an eventually nonnegative $A \in \mathbb{R}^{n \times n}$ such that $S(A) = F$. The claim is
trivial if $F = \{0\}$. Assume that $r(F) > 0$. Without loss of generality we can
assume that $r(F) = 1$. Suppose first that $F \cap S^1 = \{1\}$. To each real
point $\lambda \in F$ of multiplicity $m(\lambda)$ we associate the diagonal matrix
$G(\lambda) = \lambda I_{m(\lambda)} \in \mathbb{R}^{m(\lambda) \times m(\lambda)}$. For nonreal points $\lambda \in F$ of multiplicity $m(\lambda)$
we associate the block diagonal matrix

$$H(\lambda) = I_{m(\lambda)} \otimes \begin{bmatrix} 2\Re(\lambda) & |\lambda|^2 \\ -1 & 0 \end{bmatrix}.$$

Note that $H(\lambda) = H(\bar{\lambda})$. Let

$$C := [1] \oplus \bigoplus_{\lambda \in F \cap \mathbb{R} \setminus \{1\}} G(\lambda) \oplus \bigoplus_{\lambda \in F,\, \Im\lambda > 0} H(\lambda),$$

$$C = C_0 + C_1, \qquad C_0 = [1] \oplus 0_{(n-1) \times (n-1)},$$

$$C_1 = [0] \oplus \Big( \bigoplus_{\lambda \in F \cap \mathbb{R} \setminus \{1\}} G(\lambda) \oplus \bigoplus_{\lambda \in F,\, \Im\lambda > 0} H(\lambda) \Big).$$

Clearly,

$$S(C) = F, \qquad C_0 C_1 = C_1 C_0 = 0, \qquad C_0^m = C_0,$$

$$C^m = C_0 + C_1^m, \qquad \rho(C_1) < 1, \qquad \lim_{m \to \infty} C_1^m = 0.$$

Let $X \in \mathrm{GL}(n, \mathbb{R})$ be a matrix such that $X\mathbf{1} = X^{\top}\mathbf{1} = e_1$. Define

$$A_0 := X^{-1} C_0 X = \frac{1}{n} \mathbf{1}\mathbf{1}^{\top}, \qquad A_1 := X^{-1} C_1 X, \qquad A := A_0 + A_1 = X^{-1} C X.$$

So $S(A) = S(C) = F$. Also

$$A^m = A_0^m + A_1^m, \qquad A_0^m = A_0, \qquad \lim_{m \to \infty} A_1^m = 0 \;\Rightarrow\; \lim_{m \to \infty} A^m = A_0.$$

So A is eventually positive.

Assume now that F is a Frobenius set with $r(F) = 1$ such that $F \cap S^1$
consists of exactly $m > 1$ roots of unity. Let $\zeta = e^{\frac{2\pi\sqrt{-1}}{m}}$. Recall that
$\zeta F = F$. Let $F = F_1 \cup F_0$, where $0 \notin F_1$ and $F_0$ consists of $m(0)$ copies of 0.
($m(0) = 0 \iff F_0 = \emptyset$.) If $F_0 \ne \emptyset$ then the zero matrix of order $m(0)$ has
$F_0$ as its eigenvalue multiset. Thus it is enough to show that there exists an
eventually nonnegative matrix B whose eigenvalue multiset is $F_1$. Clearly,
$F_1$ is a Frobenius set satisfying $r(F_1) = 1$ and $F_1 \cap S^1$ consists of exactly
$m > 1$ roots of unity. Assume that all the elements of $F_1$, counted with
their multiplicities, are the coordinates of the vector $z = (z_1, \ldots, z_N)^{\top} \in \mathbb{C}^N$.
Let

$$\sigma_k(z) := \sum_{1 \le i_1 < \cdots < i_k \le N} z_{i_1} \cdots z_{i_k}, \qquad k = 1, \ldots, N,$$

be the elementary symmetric polynomials in $z_1, \ldots, z_N$. Hence the
multiset $F_1$ consists of the roots of $P(z) := z^N + \sum_{k=1}^{N} (-1)^k \sigma_k(z) z^{N-k}$.
Since $\bar{F_1} = F_1$ it follows that each $\sigma_k(z)$ is real. As $\zeta F_1 = F_1$ we deduce that
$N = mN'$ and $\sigma_k = 0$ if m does not divide k. Let $F_2$ be the root multiset of
$Q(z) := z^{N'} + \sum_{k=1}^{N'} (-1)^{mk} \sigma_{mk}(z) z^{N'-k}$. Clearly, $\bar{F_2} = F_2$. Since $1 \in F_1$
it follows that $1 \in F_2$. Furthermore, $F_1 = \phi^{-1}(F_2)$, where $\phi : \mathbb{C} \to \mathbb{C}$
is the map $z \mapsto z^m$. That is, if $z \in F_2$ has multiplicity $m(z)$ then $\phi^{-1}(z)$
consists of m points, each of multiplicity $m(z)$, such that these m points
are all the solutions of $w^m = z$. Hence $F_2 \cap S^1 = \{1\}$. Therefore $F_2$ is a
Frobenius set.

According to the previous case there exists an eventually nonnegative
matrix $A \in \mathbb{R}^{N' \times N'}$ such that $F_2$ is its eigenvalue multiset. Let $P \in \mathcal{P}_m$
be a permutation matrix corresponding to the cyclic permutation on $\langle m \rangle$,
$i \mapsto i + 1$ for $i = 1, \ldots, m$, where $m + 1 \equiv 1$. Consider the matrix $B = P \otimes A$.
Then B is eventually nonnegative, and the eigenvalue multiset of B is $F_1$.
□

7.13 Inverse eigenvalue problem for nonnegative matrices

The following problem is called the inverse eigenvalue problem for nonnegative
matrices, abbreviated as IEPFNM:

Problem 7.13.1 Let $S \subset \mathbb{C}$ be a multiset consisting of n points (counting
with their multiplicities). Find necessary and sufficient conditions such
that there exists a nonnegative $A \in \mathbb{R}^{n \times n}$ whose eigenvalue multiset is S.

Proposition 7.13.2 Let $A \in \mathbb{R}_+^{n \times n}$. Then the eigenvalue multiset S of A
satisfies the following conditions.

1. S is a union of Frobenius multisets.

2. All the moments of S are nonnegative, i.e. $s_k(S) \ge 0$ for $k \in \mathbb{N}$.

3. $s_{kl}(S) \ge \frac{s_k(S)^l}{n^{l-1}}$ for each $k, l \in \mathbb{N}$.

Proof. 1 follows from Theorem 6.4.4 and the Perron-Frobenius theorem
applied to each irreducible diagonal block of the matrix in (6.4.3). Since
$A^k \ge 0$ it follows that $\operatorname{tr} A^k \ge 0$. Hence 2 holds. Since $A^k \ge 0$ it is enough
to show the inequality in 3 for $k = 1$. Decompose $A = [a_{ij}]$ as $D + A_0$,
where $D = \operatorname{diag}(a_{11}, \ldots, a_{nn})$ and $A_0 := A - D \ge 0$. So $A^l - D^l \ge A_0^l \ge 0$.
Hence $\operatorname{tr} A^l \ge \operatorname{tr} D^l = \sum_{i=1}^{n} a_{ii}^l$. The Hölder inequality for $p = l$ yields that
$\sum_{i=1}^{n} a_{ii} \le \big( \sum_{i=1}^{n} a_{ii}^l \big)^{\frac{1}{l}} n^{\frac{l-1}{l}}$, which yields 3. □
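The necessary conditions 2 and 3 can be checked numerically on a concrete matrix. The following is a minimal sketch in plain Python; the helper names (`matmul`, `moments`, `satisfies_condition_3`) and the tolerance are illustrative assumptions, not part of the text. It uses $s_k(S) = \operatorname{tr} A^k$ and tests $n^{l-1} s_{kl} \ge s_k^l$ for small k and l.

```python
import random

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def moments(A, count):
    """Return [s_1, ..., s_count], where s_k = tr(A^k) is the k-th moment of S."""
    out, P = [], A
    for _ in range(count):
        out.append(sum(P[i][i] for i in range(len(A))))
        P = matmul(P, A)
    return out

def satisfies_condition_3(A, kmax=3, lmax=3, tol=1e-9):
    """Check n^(l-1) * s_{kl} >= s_k^l for 1 <= k <= kmax and 1 <= l <= lmax."""
    n = len(A)
    s = moments(A, kmax * lmax)  # s[j - 1] holds s_j
    for k in range(1, kmax + 1):
        for l in range(1, lmax + 1):
            lhs = n ** (l - 1) * s[k * l - 1]
            rhs = s[k - 1] ** l
            if lhs < rhs - tol * max(1.0, abs(rhs)):
                return False
    return True

random.seed(0)
A = [[random.random() for _ in range(4)] for _ in range(4)]  # entrywise nonnegative
print(satisfies_condition_3(A))
```

A matrix with a negative entry may violate the inequality; for instance the block diagonal matrix with blocks $[\sqrt{2}]$ and a rotation by a right angle has $s_1 = \sqrt{2}$ and $s_2 = 0$.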

The following result gives simple sufficient conditions for a multiset S to be
the eigenvalue multiset of a nonnegative matrix.

Proposition 7.13.3 Let $S \subset \mathbb{C}$ be a multiset containing n elements,
counting with multiplicities. Assume that the elementary symmetric polynomials
corresponding to S satisfy $(-1)^{k-1} \sigma_k(S) \ge 0$ for $k = 1, \ldots, n$.
Then there exists $A \in \mathbb{R}_+^{n \times n}$ such that S is the eigenvalue multiset of A.

Proof. Note that the companion matrix to the polynomial
$P(z) = z^n + \sum_{i=1}^{n} (-1)^i \sigma_i(S) z^{n-i}$ is a nonnegative matrix. □
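The companion-matrix argument can be illustrated as follows. This is a sketch only: the function names are hypothetical, and the companion matrix is written in one common convention (ones on the superdiagonal, negated coefficients in the last row), which is an assumption since the text does not fix a convention.

```python
from itertools import combinations
from math import prod

def elementary_symmetric(values):
    """Return [sigma_1, ..., sigma_n] of the given multiset."""
    n = len(values)
    return [sum(prod(c) for c in combinations(values, k)) for k in range(1, n + 1)]

def companion_from_multiset(values):
    """Companion matrix of P(z) = z^n + sum_i (-1)^i sigma_i z^(n-i)."""
    n = len(values)
    sigma = elementary_symmetric(values)
    coeff = [0.0] * n                      # coeff[j] multiplies z^j in P(z)
    for i in range(1, n + 1):
        coeff[n - i] = (-1) ** i * sigma[i - 1]
    C = [[0.0] * n for _ in range(n)]
    for row in range(n - 1):
        C[row][row + 1] = 1.0              # shift structure
    C[n - 1] = [-c for c in coeff]         # last row: (-1)^(i-1) sigma_i
    return C

S = [1.0, -0.3, -0.4]
C = companion_from_multiset(S)
print(all(entry >= 0 for row in C for entry in row))
```

In this convention the last row consists of the numbers $(-1)^{i-1}\sigma_i(S)$, so the sign hypothesis of the proposition is exactly entrywise nonnegativity of the matrix.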

Recall the MacLaurin inequalities [HPL52, p. 52].

Proposition 7.13.4 Let $w = (w_1, \ldots, w_{n-1})^{\top} \in \mathbb{R}_+^{n-1}$. Then the
sequence $\Big( \frac{\sigma_k(w)}{\binom{n-1}{k}} \Big)^{\frac{1}{k}}$ is nonincreasing for $k = 1, \ldots, n - 1$.

Proposition 7.13.5 Let S be a multiset of real numbers which contains
exactly one positive number. Assume that the sum of all elements in
S is nonnegative. Then S satisfies the conditions of Proposition 7.13.3. In
particular, there exists $A \in \mathbb{R}_+^{n \times n}$ such that S is the eigenvalue multiset of
A.

Proof. Without loss of generality we may assume that
$S = \{1, -w_1, \ldots, -w_{n-1}\}$ where $w_i \ge 0$ for $i = 1, \ldots, n - 1$ and $1 \ge \sum_{i=1}^{n-1} w_i$.
Denote $z = (1, -w_1, \ldots, -w_{n-1})^{\top}$ and $w = (w_1, \ldots, w_{n-1})^{\top}$. Clearly
$\sigma_1(z) \ge 0$ and $(-1)^{n-1} \sigma_n(z) = \sigma_{n-1}(w) \ge 0$. Observe next that

$$\sigma_{k+1}(z) = (-1)^k \big( \sigma_k(w) - \sigma_{k+1}(w) \big) \quad \text{for } k = 1, \ldots, n - 2.$$

Thus to prove that $(-1)^k \sigma_{k+1}(z) \ge 0$ it is enough to show that the
sequence $\sigma_i(w)$, $i = 1, \ldots, n - 1$, is nonincreasing.

Observe that $\sigma_1(w) \le 1$. We now use Proposition 7.13.4. First
observe that



Next 




Hence S satisfies the conditions of Proposition 7.13.3. The last part of
Proposition 7.13.3 yields that there exists $A \in \mathbb{R}_+^{n \times n}$ such that S is the
eigenvalue multiset of A. □
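One way to carry out the monotonicity step of this proof from the MacLaurin inequalities, using only $\sigma_1(w) \le 1$, is sketched below. The normalized means $p_k$ are notation introduced here for illustration, and this route need not coincide with the one the original argument takes.

```latex
% p_k := \sigma_k(w) / \binom{n-1}{k}, the normalized elementary symmetric means of w.
\begin{align*}
p_k^{1/k} &\le p_1 = \frac{\sigma_1(w)}{n-1} \le \frac{1}{n-1},
  \qquad k = 1, \ldots, n-1, \\
p_{k+1} &\le p_k^{(k+1)/k} = p_k \, p_k^{1/k} \le \frac{p_k}{n-1}, \\
\sigma_{k+1}(w) &= \binom{n-1}{k+1} \, p_{k+1}
  \le \frac{\binom{n-1}{k+1}}{(n-1)\binom{n-1}{k}} \, \sigma_k(w)
  = \frac{n-1-k}{(k+1)(n-1)} \, \sigma_k(w) \le \sigma_k(w).
\end{align*}
```

The first line is the MacLaurin chain $p_k^{1/k} \le p_1$, the second is $p_{k+1}^{1/(k+1)} \le p_k^{1/k}$ raised to the power $k + 1$, and the last uses $\binom{n-1}{k+1} / \binom{n-1}{k} = \frac{n-1-k}{k+1}$.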

Example 7.13.6 Let $S = \{\sqrt{2}, \sqrt{-1}, -\sqrt{-1}\}$. Then S is a Frobenius
set. Furthermore, $s_2(S) = 0$ and all other moments of S are positive. Hence
condition 3 of Proposition 7.13.2 does not hold for $k = 1$, $l = 2$. In
particular, there is an eventually nonnegative matrix $A \in \mathbb{R}^{3 \times 3}$, which
cannot be nonnegative, whose eigenvalue multiset is S.
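The moments in this example are easy to tabulate. A short sketch (the function name `moment` is an illustrative choice) using Python's built-in complex numbers:

```python
def moment(S, k):
    """k-th moment s_k(S): the sum of z^k over the multiset S."""
    return sum(z ** k for z in S)

S = [2 ** 0.5, 1j, -1j]

# s_k(S) = 2^(k/2) + 2 cos(k pi / 2); the imaginary parts cancel.
for k in range(1, 7):
    print(k, round(moment(S, k).real, 12))

# Condition 3 with k = 1, l = 2 and n = 3 would require 3 * s_2 >= s_1^2.
print(3 * moment(S, 2).real >= moment(S, 1).real ** 2)
```

Only $s_2$ vanishes, and since $s_1^2 = 2 > 0 = 3 s_2$ the inequality of condition 3 fails for $k = 1$, $l = 2$.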

Theorem 7.13.7 Let $S = \{\lambda_1, \lambda_2, \lambda_3\}$ be a multiset satisfying the
following properties.

1. $r(S) \in S$.

2. $S = \bar{S}$.

3. $s_1(S) \ge 0$.

4. $(s_1(S))^2 \le 3 s_2(S)$.

Then there exists $A \in \mathbb{R}_+^{3 \times 3}$ such that S is the eigenvalue multiset of A.

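Proof. Suppose first that $S \subset \mathbb{R}$. It is straightforward to show that S
is a union of Frobenius multisets. In that case the theorem can be shown
in a straightforward way. See Problem 2. It is left to discuss the following
renormalized case $S = \{r, e^{\sqrt{-1}\theta}, e^{-\sqrt{-1}\theta}\}$, where $r \ge 1$, $\theta \in (0, \pi)$. The condition
$s_1(S) \ge 0$ yields that

$$2\cos\theta + r \ge 0. \tag{7.13.1}$$

The condition $(s_1(S))^2 \le 3 s_2(S)$ boils down to

$$\big( r - 2\cos(\tfrac{\pi}{3} + \theta) \big) \big( r - 2\cos(\tfrac{\pi}{3} - \theta) \big) \ge 0.$$

For $r \ge 1$, $\theta \in (0, \pi)$ we have $r - 2\cos(\frac{\pi}{3} + \theta) > 0$. Hence condition 4 is
equivalent to

$$r - 2\cos(\tfrac{\pi}{3} - \theta) \ge 0. \tag{7.13.2}$$

Let U be the orthogonal matrix

$$U = \frac{1}{\sqrt{6}} \begin{bmatrix} \sqrt{2} & \sqrt{3} & -1 \\ \sqrt{2} & 0 & 2 \\ \sqrt{2} & -\sqrt{3} & -1 \end{bmatrix}$$

and $J = \mathbf{1}_3 \mathbf{1}_3^{\top}$. So $U^{\top} J U = \operatorname{diag}(3, 0, 0)$. S is the eigenvalue set of

$$B = \begin{bmatrix} r & 0 & 0 \\ 0 & \cos\theta & \sin\theta \\ 0 & -\sin\theta & \cos\theta \end{bmatrix}.$$

Then $A := U B U^{\top}$ is the following matrix:

$$A = \frac{r}{3} \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} - \frac{2}{3} \begin{bmatrix} -\cos\theta & \cos(\frac{\pi}{3} + \theta) & \cos(\frac{\pi}{3} - \theta) \\ \cos(\frac{\pi}{3} - \theta) & -\cos\theta & \cos(\frac{\pi}{3} + \theta) \\ \cos(\frac{\pi}{3} + \theta) & \cos(\frac{\pi}{3} - \theta) & -\cos\theta \end{bmatrix}.$$

The above inequalities show that $A \ge 0$. □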
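The closed form of $A = U B U^{\top}$ can be cross-checked numerically. The sketch below (function names are illustrative assumptions) builds A both from the closed form and by explicit conjugation, so that the two can be compared for given r and $\theta$.

```python
from math import cos, sin, pi, sqrt

def matmul(A, B):
    """Product of two 3 x 3 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def realization_closed_form(r, theta):
    """A = (r/3) J - (2/3) M, with M built from -cos(theta), cos(pi/3 +- theta)."""
    a, b, c = -cos(theta), cos(pi / 3 + theta), cos(pi / 3 - theta)
    M = [[a, b, c], [c, a, b], [b, c, a]]
    return [[(r - 2 * M[i][j]) / 3 for j in range(3)] for i in range(3)]

def realization_by_conjugation(r, theta):
    """A = U B U^T with the orthogonal U and the block diagonal B of the proof."""
    s = 1 / sqrt(6)
    U = [[s * sqrt(2), s * sqrt(3), -s],
         [s * sqrt(2), 0.0, 2 * s],
         [s * sqrt(2), -s * sqrt(3), -s]]
    B = [[r, 0.0, 0.0],
         [0.0, cos(theta), sin(theta)],
         [0.0, -sin(theta), cos(theta)]]
    Ut = [[U[j][i] for j in range(3)] for i in range(3)]
    return matmul(matmul(U, B), Ut)

# r = 1, theta = 2 pi / 3: both (7.13.1) and (7.13.2) hold with equality.
for row in realization_closed_form(1.0, 2 * pi / 3):
    print([round(x, 12) for x in row])
```

For $r = 1$, $\theta = \frac{2\pi}{3}$ the multiset S consists of the three cube roots of 1 and the construction returns, up to rounding, a cyclic permutation matrix.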

A weaker version of the solution of Problem 7.13.1 was given in [BoH91]. 

Theorem 7.13.8 Let $T \subset \mathbb{C} \setminus \{0\}$ be a Frobenius multiset satisfying the
following conditions.

1. $T(r(T)) = \{r(T)\}$.

2. $s_k(T) \ge 0$ for $k \in \mathbb{N}$.

3. If $s_k(T) > 0$ then $s_{kl}(T) > 0$ for all $l \in \mathbb{N}$.

Then there exists a square nonnegative primitive matrix A, whose eigenvalue
multiset is a union of T and $m_0 \ge 0$ copies of 0.

We prove the above theorem under the stronger assumption

$$s_k(T) > 0 \quad \text{for } k \ge 2, \tag{7.13.3}$$

following the arguments of [Laf10].



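Lemma 7.13.9 Let $A_n \in \mathbb{C}^{n \times n}$ be the following lower Hessenberg matrix

$$A_n = \begin{bmatrix} s_1 & 1 & 0 & \cdots & 0 & 0 \\ s_2 & s_1 & 2 & \cdots & 0 & 0 \\ s_3 & s_2 & s_1 & \ddots & \vdots & \vdots \\ \vdots & \vdots & \vdots & \ddots & n-2 & 0 \\ s_{n-1} & s_{n-2} & s_{n-3} & \cdots & s_1 & n-1 \\ s_n & s_{n-1} & s_{n-2} & \cdots & s_2 & s_1 \end{bmatrix}. \tag{7.13.4}$$

Let $S = \{\lambda_1, \ldots, \lambda_n\} \subset \mathbb{C}$ be the unique multiset such that $s_k = s_k(S)$
for $k = 1, \ldots, n$. Let $\sigma_1, \ldots, \sigma_n$ be the n elementary symmetric polynomials
corresponding to S. Then the characteristic polynomial of $A_n$ is given by

$$\det(z I_n - A_n) = z^n + \sum_{i=1}^{n} (-1)^i \, i! \binom{n}{i} \sigma_i z^{n-i}. \tag{7.13.5}$$

Proof. Recall the Newton identities:

$$s_1 = \sigma_1, \qquad s_k = (-1)^{k-1} k \sigma_k + \sum_{j=1}^{k-1} (-1)^{j-1} \sigma_j s_{k-j} \quad \text{for } k = 2, \ldots, n.$$

Let $p(z)$ be the polynomial given by the right-hand side of (7.13.5). Denote
by $C(p(z)) \in \mathbb{C}^{n \times n}$ the companion matrix corresponding to $p(z)$. Let
$Q = [q_{ij}] \in \mathbb{C}^{n \times n}$ be the following lower triangular matrix.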



(.7-1)! 



1, . . . , n, where $\sigma_0 := 1$.
Use the Newton identities to verify the equality $A_n Q = Q\, C(p(z))$. Hence
$A_n$ is similar to $C(p(z))$, and the characteristic polynomial of $A_n$ is given
by (7.13.5). □
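The formula (7.13.5) can be tested in exact arithmetic on small multisets. The following sketch is not the similarity argument of the proof: it computes the characteristic polynomial independently by the Faddeev-LeVerrier recursion (a swapped-in technique) and compares it with the coefficients $(-1)^i \frac{n!}{(n-i)!} \sigma_i$. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial, prod

def hessenberg_from_moments(s):
    """Matrix (7.13.4): s_{i-j+1} on and below the diagonal, 1, ..., n-1 above it."""
    n = len(s)
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            A[i][j] = s[i - j]             # s[k - 1] holds s_k
        if i + 1 < n:
            A[i][i + 1] = i + 1
    return A

def char_poly(A):
    """Coefficients c[0..n] with det(zI - A) = sum_j c[j] z^j (Faddeev-LeVerrier)."""
    n = len(A)
    c = [Fraction(0)] * (n + 1)
    c[n] = Fraction(1)
    M = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        AM = [[sum(A[i][t] * M[t][j] for t in range(n)) for j in range(n)]
              for i in range(n)]
        M = [[AM[i][j] + (c[n - k + 1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        trace_AM = sum(A[i][t] * M[t][i] for i in range(n) for t in range(n))
        c[n - k] = -trace_AM / k
    return c

def predicted_coefficients(S):
    """Coefficients of the right-hand side of (7.13.5) for the multiset S."""
    n = len(S)
    sigma = [sum(prod(t) for t in combinations(S, i)) for i in range(1, n + 1)]
    c = [0] * (n + 1)
    c[n] = 1
    for i in range(1, n + 1):
        c[n - i] = (-1) ** i * (factorial(n) // factorial(n - i)) * sigma[i - 1]
    return c

S = [1, 2, 3]
s = [sum(x ** k for x in S) for k in range(1, len(S) + 1)]
print(char_poly(hessenberg_from_moments(s)) == predicted_coefficients(S))
```

Note that $i! \binom{n}{i} = \frac{n!}{(n-i)!}$, which is the form used in `predicted_coefficients`.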

Proof of Theorem 7.13.8 under the assumption (7.13.3). Let
$T = \{\lambda_1, \ldots, \lambda_n\}$ be a multiset in $\mathbb{C}$. Denote by $\sigma_1, \ldots, \sigma_n$ the elementary
symmetric polynomials corresponding to T. Let $p(z) = z^n + \sum_{i=1}^{n} (-1)^i \sigma_i z^{n-i}$
be the normalized polynomial whose zero set is T. For $m \in \mathbb{N}$ denote
$S_m := T \cup \{\underbrace{0, \ldots, 0}_{m}\}$. Let $\sigma_{i,m}$ be the i-th elementary symmetric
polynomial corresponding to $S_m$ for $i = 1, \ldots, n + m$. Then $\sigma_{i,m} = \sigma_i$ for
$i = 1, \ldots, n$ and $\sigma_{i,m} = 0$ for $i = n + 1, \ldots, n + m$. Then $s_k(T) = s_k(S_m)$ for
all $k \in \mathbb{N}$. Denote by $A_{n+m} \in \mathbb{C}^{(n+m) \times (n+m)}$ the matrix (7.13.4), where
$s_k = s_k(T)$ for $k = 1, \ldots, n + m$. Observe that

$$\det\Big( z I_{n+m} - \frac{1}{n+m} A_{n+m} \Big) = \Big( z^n + \sum_{i=1}^{n} \Big( \prod_{j=1}^{i-1} \big( 1 - \tfrac{j}{n+m} \big) \Big) (-1)^i \sigma_i z^{n-i} \Big) z^m.$$

Let

$$p_m(z) := z^n + \sum_{i=1}^{n} \Big( \prod_{j=1}^{i-1} \big( 1 - \tfrac{j}{n+m} \big)^{-1} \Big) (-1)^i \sigma_i z^{n-i}. \tag{7.13.6}$$

Denote by $T_m = \{\lambda_{1,m}, \ldots, \lambda_{n,m}\}$ the multiset formed by the n zeros of $p_m$.
Since $\lim_{m \to \infty} p_m(z) = p(z)$ we deduce that $\lim_{m \to \infty} \operatorname{pdist}(T_m, T) = 0$.
That is, we can rename $\lambda_{1,m}, \ldots, \lambda_{n,m}$, $m \in \mathbb{N}$, such that $\lim_{m \to \infty} \lambda_{i,m} = \lambda_i$
for $i = 1, \ldots, n$. Let $B_{m+n} \in \mathbb{C}^{(m+n) \times (m+n)}$ be the matrix defined by
(7.13.4), where $s_k$, $k = 1, \ldots, m + n$, are the moments corresponding to $T_m$.
Then $\det\big( z I_{n+m} - \frac{1}{n+m} B_{n+m} \big) = z^m p(z)$. Thus, if the first $n + m$ moments
corresponding to $T_m$ are nonnegative, it follows that the multiset $S_m$
is realized as an eigenvalue set of a nonnegative matrix.

We now show that the above condition holds for $m \ge N$, if T satisfies
assumption 1 of Theorem 7.13.8 and (7.13.3). It is enough to consider
the case where $T = \{\lambda_1 = 1, \lambda_2, \ldots, \lambda_n\}$, where $1 > |\lambda_2| \ge \cdots \ge |\lambda_n|$. Let
$\varepsilon := \frac{1 - |\lambda_2|}{4}$. First we choose M big enough such that after renaming the
elements of the multiset $T_m$ we have that $|\lambda_{i,m} - \lambda_i| \le \varepsilon$ for $i = 1, \ldots, n$
and $m \ge M$. Note that since $\bar{T}_m = T_m$ it follows that $\lambda_{1,m} \in \mathbb{R}$ and
$\lambda_{1,m} \ge 1 - \varepsilon$ for $m \ge M$. Furthermore, $|\lambda_{i,m}| \le 1 - 3\varepsilon$ for $i = 2, \ldots, n$.
Hence

$$s_k(T_m) \ge (1 - \varepsilon)^k \Big( 1 - (n - 1) \Big( \frac{1 - 3\varepsilon}{1 - \varepsilon} \Big)^k \Big).$$

Thus for $k > k(\varepsilon) := \frac{\log(n - 1)}{\log(1 - \varepsilon) - \log(1 - 3\varepsilon)}$ and $m \ge M$ we have that $s_k(T_m) > 0$.
Clearly $s_1(T_m) = s_1(T)$ and $\lim_{m \to \infty} s_k(T_m) = s_k(T)$ for $k = 2, \ldots, \lceil k(\varepsilon) \rceil$.
Since $s_k(T) > 0$ for $k > 1$ we deduce the positivity of all $s_k(T_m)$ for all
$k > 1$ if $m \ge N \ge M$. □



It is straightforward to generalize this result to a general Frobenius 
multiset. See [Fri09]. 

Theorem 7.13.10 Let $T \subset \mathbb{C} \setminus \{0\}$ be a Frobenius multiset satisfying
the following conditions.

1. $T(r(T)) = \{r(T), \zeta r(T), \ldots, \zeta^{m-1} r(T)\}$ for $\zeta = e^{\frac{2\pi\sqrt{-1}}{m}}$, where $m > 1$
is an integer.

2. $s_k(T) \ge 0$ for $k \in \mathbb{N}$.

3. If $s_k(T) > 0$ then $s_{kl}(T) > 0$ for all $l \in \mathbb{N}$.

Then there exists a square nonnegative irreducible matrix A, whose eigenvalue
multiset is a union of T and $m_0 \ge 0$ copies of 0.

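Proof. Observe first that $s_k(T) = 0$ if $m \nmid k$. Let $\phi : \mathbb{C} \to \mathbb{C}$ be the
map $z \mapsto z^m$. Since $\zeta T = T$ it follows that for $z \in T$ with multiplicity
$m(z)$ the multiplicity of $z^m$ in $\phi(T)$ is $m \cdot m(z)$. Hence $\phi(T)$ is
a union of m copies of a Frobenius set $T_1$, where $r(T_1) = r(T)^m$ and
$T_1(r(T_1)) = \{r(T_1)\}$. Moreover $s_{km}(T) = m s_k(T_1)$. Hence $T_1$ satisfies
the assumptions of Theorem 7.13.8. Thus there exists a primitive matrix
$B \in \mathbb{R}_+^{n \times n}$ whose nonzero eigenvalue multiset is $T_1$. Let $A = [A_{ij}]_{i,j=1}^{m}$ be
the following nonnegative matrix of order mn:

$$A = \begin{bmatrix} 0_{n \times n} & I_n & 0_{n \times n} & \cdots & 0_{n \times n} \\ 0_{n \times n} & 0_{n \times n} & I_n & \cdots & 0_{n \times n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0_{n \times n} & 0_{n \times n} & 0_{n \times n} & \cdots & I_n \\ B & 0_{n \times n} & 0_{n \times n} & \cdots & 0_{n \times n} \end{bmatrix}. \tag{7.13.7}$$

Then A is irreducible and the nonzero part of its eigenvalue multiset is T. (See
Problems 4 and 5.)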
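The relation between the spectra of A and B rests on the identity $A^m = I_m \otimes B$ for a block-cyclic matrix of this shape. A small sketch that builds such a matrix and checks the identity in integer arithmetic is given below; the function names are illustrative, and the placement of B in the bottom-left block is the reading of (7.13.7) assumed here.

```python
def block_cyclic(B, m):
    """mn x mn matrix with I_n on the block superdiagonal and B in the bottom-left block."""
    n = len(B)
    N = m * n
    A = [[0] * N for _ in range(N)]
    for blk in range(m - 1):
        for i in range(n):
            A[blk * n + i][(blk + 1) * n + i] = 1
    for i in range(n):
        for j in range(n):
            A[(m - 1) * n + i][j] = B[i][j]
    return A

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    N = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def matpow(X, e):
    """X^e for an integer e >= 1 by repeated multiplication."""
    P = X
    for _ in range(e - 1):
        P = matmul(P, X)
    return P

def block_diag(B, m):
    """I_m (Kronecker product) B: m copies of B along the diagonal."""
    n = len(B)
    D = [[0] * (m * n) for _ in range(m * n)]
    for blk in range(m):
        for i in range(n):
            for j in range(n):
                D[blk * n + i][blk * n + j] = B[i][j]
    return D

B = [[1, 1], [1, 0]]          # a primitive nonnegative matrix
A = block_cyclic(B, 3)
print(matpow(A, 3) == block_diag(B, 3))
```

Since $A^m$ is block diagonal with m copies of B, every eigenvalue w of A satisfies that $w^m$ is an eigenvalue of B, in line with Problem 5.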



Problems

1. Definition 7.13.11 Let $S \subset \mathbb{C}$ be a finite multiset. S is called a semi
Frobenius multiset if either S has m elements all equal to 0, or the
following conditions hold.

(a) $r(S) > 0$, $S = \bar{S}$, $r(S) \in S$.

(b) $m(z) \le \mu := m(r(S))$ for each $z \in S$ such that $|z| = r(S)$.

(c) Assume that S contains exactly m distinct points satisfying $|z| = r(S)$,
$m(z) = \mu$. Then $\zeta S = S$ for $\zeta = e^{\frac{2\pi\sqrt{-1}}{m}}$.

S is called an almost Frobenius multiset if the number of points in S,
counted with their multiplicities, satisfying $z \in S$, $|z| = r(S)$, $m(z) < \mu$
is strictly less than $m\mu$.

Let $f_S$ be given by (7.12.6). Show

(a) $S = \{1, 1, z, \bar{z}\}$, with $|z| = 1$, $z \ne \pm 1$, is a semi Frobenius multiset,
and $f_S$ has nonnegative moments.

(b) Let $S = \cup_{i=1}^{j} S_i$, where each $S_i$ is an almost Frobenius multiset.
Then $f_S$ has eventually nonnegative MacLaurin coefficients.

(c) Assume that the MacLaurin coefficients of $f_S$ are eventually
nonnegative. Then $r(S) \in S$. Suppose furthermore that $0 < \alpha < r(S)$
is the second largest positive number contained in S. Then
$S \cap \{z \in \mathbb{C}, \alpha < |z| \le r(S)\}$ is a semi Frobenius set.

(d) Assume that the MacLaurin coefficients of $f_S$ are eventually
nonnegative. Suppose that S contains only one positive number of
multiplicity one. Then S is semi Frobenius.

(e) Assume that the MacLaurin coefficients of $f_S$ are eventually
nonnegative. Suppose that S contains only two positive numbers
of multiplicity one each: $r(S) > \alpha > 0$. Decompose S to
$S_1 \cup S_2$, where $S_1$ is a maximal semi Frobenius set containing
$\{z \in \mathbb{C}, \alpha < |z| \le r(S)\}$. If $\alpha \in S_1$ then $S_2 = \emptyset$. Suppose that
$\alpha \in S_2$. Then $S_3 := S_2 \cap \{z \in \mathbb{C}, |z| = \alpha\}$ is a set, i.e. $m(z) = 1$
for each $z \in S_3$. Assume for simplicity of the exposition that
$\alpha = 1$. Let $m' \in [1, m)$ be the greatest divisor of $m > 1$, entering
in the definition of the Frobenius multiset $S_1$, such that all $m'$
roots of 1 are in $S_3$. Let $m'' := \frac{m}{m'} > 1$. Then there exists $r \in \mathbb{N}$
coprime with $m''$ such that one of the following conditions holds.

i. If $m''$ is even then $S_3 = S_4$, where $S_4$ consists of all $m'r$
roots of 1.

ii. If $m''$ is odd then either $S_3 = S_4$ or $S_3 = S_4 \cup S_5$, where

Hint: Use the function g in the proof of Theorem 7.12.6, and/or
consult [Fri78b, Thm 4] and its proof.

2. Let $S = \{\lambda_1, \ldots, \lambda_n\} \subset \mathbb{R}$ be a union of Frobenius multisets. Assume
furthermore that $\sum_{i=1}^{n} \lambda_i \ge 0$. Show that if $n \le 4$ then there exists a
nonnegative $n \times n$ matrix whose eigenvalue multiset is S. Hint: For
$n = 3$ use Proposition 7.13.5. For $n = 4$ and the case where S contains
exactly two negative numbers consult the proof of [LoL78, Thm. 3].

3. Show that for $n \ge 4$ the multiset $S := \{\sqrt{2}, \sqrt{2}, \sqrt{-1}, -\sqrt{-1}, 0, \ldots, 0\}$
satisfies all the conditions of Proposition 7.13.2. However there is no
$A \in \mathbb{R}_+^{n \times n}$ with the eigenvalue set S.

4. Let $B \in \mathbb{R}_+^{n \times n}$ be a primitive matrix. Show that the matrix
$A \in \mathbb{R}_+^{mn \times mn}$ defined by (7.13.7) is irreducible for any integer $m > 1$.

5. Let $B \in \mathbb{C}^{n \times n}$. Assume that T is the eigenvalue multiset of B. Assume
that $A \in \mathbb{C}^{mn \times mn}$ is defined by (7.13.7). Let S be the eigenvalue
multiset of A. Show that $w \in S$ if and only if $w^m \in T$. Furthermore
the multiplicity of $0 \ne w \in S$ equals the multiplicity of $w^m$ in T.
The multiplicity of $0 \in S$ is m times the multiplicity of $0 \in T$.

7.14 Cones 

Let V be a vector space over $\mathbb{C}$. Then V is a vector space over $\mathbb{R}$, which
we denote by $V_{\mathbb{R}}$, or simply V when no ambiguity arises. See Problem 1.

Definition 7.14.1 Let V be a finite dimensional vector space over
$\mathbb{F} = \mathbb{R}, \mathbb{C}$. A set $K \subset V$ is called a cone if

1. $K + K \subset K$, i.e. $x + y \in K$ for each $x, y \in K$.

2. $\mathbb{R}_+ K \subset K$, i.e. $ax \in K$ for each $a \in [0, \infty)$ and $x \in K$.

Assume that $K \subset V$ is a cone. (Note that K is a convex set.) Then

1. ri K and dim K are the relative interior and the dimension of K, viewed
as a convex set.

2. $K^* := \{f \in V^*, \; \Re f(x) \ge 0 \text{ for all } x \in K\}$ is called the conjugate
cone (in $V^*$).

3. K is called pointed if $K \cap -K = \{0\}$.

4. K is called generating if $K - K = V$, i.e. any $z \in V$ can be represented
as $x - y$ for some $x, y \in K$.

5. K is called proper if K is closed, pointed and generating.

6. For $x, y \in V$ we denote: $x \ge^{K} y$ if $x - y \in K$; $x \gneq^{K} y$ if $x \ge^{K} y$
and $x \ne y$; $x >^{K} y$ if $x - y \in \operatorname{ri} K$.

7. For $x \in V$ we call: x nonnegative relative to K if $x \ge^{K} 0$; x
semipositive relative to K if $x \gneq^{K} 0$; x positive relative to K if
$x \in \operatorname{ri} K$. When there is no ambiguity about the cone K we drop the
term relative to K.

8. $K_1 \subset K$ is called a subcone of K if $K_1$ is a cone in V. $F \subset K$ is
called a face of K if F is a subcone of K, and $y \in F$ if $y \in K$ and
there exists $x \in F$ such that $x \ge^{K} y$. dim F is called the dimension
of F. $F = \{0\}$, $F = K$ are called trivial faces
($\dim \{0\} = 0$). $x \gneq^{K} 0$ is called an extreme ray if $\mathbb{R}_+ x$ is a face in K
(of dimension 1). For a set $X \subset K$, the face $F(X)$ generated by X is
the intersection of all faces of K containing X.

9. Let $T \in \operatorname{Hom}(V, V)$. Then: $T \ge^{K} 0$, and T is called nonnegative
with respect to K, if $TK \subset K$; $T \gneq^{K} 0$, and T is called semipositive
with respect to K, if $T \ge^{K} 0$ and $T \ne 0$; $T >^{K} 0$, and T is called
positive with respect to K, if $T(K \setminus \{0\}) \subset \operatorname{ri} K$. $T \ge^{K} 0$ is called
primitive with respect to K if every face F of K satisfying $TF \subset F$,
i.e. every T-invariant face, is a trivial face of K. T is called
eventually positive with respect to K if $T^l >^{K} 0$ for all integers $l \ge L$
$(\ge 1)$. When there is no ambiguity about the cone K we drop the
term relative to K. Denote by $\operatorname{Hom}(V, V)_K$ the set of all $T \ge^{K} 0$.
For $T, S \in \operatorname{Hom}(V, V)$ we denote: $T \ge^{K} S \iff T - S \ge^{K} 0$,
$T \gneq^{K} S \iff T - S \gneq^{K} 0$, $T >^{K} S \iff T - S >^{K} 0$.

As pointed out in Problem 3, without loss of generality we can discuss only 
the cones in real vector spaces. Also, in most of the applications the cones 
of interest lie in the real vector spaces. Since most of the results we state 
hold for cones over complex vector spaces, we state our results for cones in 
real or complex vector spaces, and give a proof only for the real case, when 
possible. 

Lemma 7.14.2 Let V be a finite dimensional vector space over
$\mathbb{F} = \mathbb{R}, \mathbb{C}$. Let K be a cone in V. Then $V = K - K$, i.e. K is generating, if and
only if the interior of K is nonempty, i.e. $\dim K = \dim_{\mathbb{R}} V$.

Proof. It is enough to assume that V is an n dimensional vector space
over $\mathbb{R}$. Let $k = \dim K$. Then span K is a k dimensional vector space in V.
Assume that $K - K = V$. Since $K - K \subset \operatorname{span} K$ we deduce that $\dim K = n$.
Hence K must have an interior.

Assume now that K has an interior. Hence its interior must contain
n linearly independent vectors $x_1, \ldots, x_n$ which form a basis in V. So
$\sum_{i=1}^{n} a_i x_i \in K$ for any $a_1, \ldots, a_n \ge 0$. Since any $z \in V$ is of the form
$\sum_{i=1}^{n} z_i x_i = \sum_{z_i > 0} z_i x_i - \sum_{z_i < 0} (-z_i) x_i$, we deduce that $K - K = V$. □
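The last step of the proof is a splitting of coefficients into positive and negative parts. A small sketch for the nonnegative orthant in $\mathbb{R}^2$ follows; the function name, the choice of cone and the basis vectors are illustrative assumptions.

```python
def decompose(coeffs, basis):
    """Write z = sum_i coeffs[i] * basis[i] as p - q, where p collects the
    terms with positive coefficients and q the terms with negative ones."""
    dim = len(basis[0])
    p = [0.0] * dim
    q = [0.0] * dim
    for c, x in zip(coeffs, basis):
        target, weight = (p, c) if c > 0 else (q, -c)
        for t in range(dim):
            target[t] += weight * x[t]
    return p, q

# Two linearly independent vectors from the interior of the nonnegative orthant.
basis = [(2.0, 1.0), (1.0, 2.0)]
p, q = decompose([3.0, -2.0], basis)
print(p, q, [pi - qi for pi, qi in zip(p, q)])
```

Both p and q are nonnegative combinations of vectors in K, hence lie in K, and their difference recovers z.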



Theorem 7.14.3 Let $K \subset V$ be a proper cone over $\mathbb{F} = \mathbb{R}, \mathbb{C}$, where
$\dim V \in [1, \infty)$. Then the following conditions hold.

1. There exists $f \in K^*$ which is strictly positive, i.e. $\Re f(x) > 0$ if
$x \gneq^{K} 0$.

2. Every $x \gneq^{K} 0$ is a nonnegative linear combination of at most $\dim_{\mathbb{R}} V$
extreme rays of K.

3. The conjugate cone $K^* \subset V^*$ is proper.

Proof. It is enough to assume that V is a vector space over $\mathbb{R}$. Observe
that for any $u \gneq^{K} 0$ the set $I(u) := \{x \in V, \; u \ge^{K} x \ge^{K} -u\}$ is a compact
set. Clearly $I(u)$ is closed. It is left to show that $I(u)$ is bounded.
Fix a norm $\|\cdot\|$ on V. Assume to the contrary that there exists a sequence
$0 \ne x_m \in I(u)$ such that $\lim_{m \to \infty} \|x_m\| = \infty$. Let $y_m = \frac{1}{\|x_m\|} x_m$, $m \in \mathbb{N}$.
Since $\|y_m\| = 1$, $m \in \mathbb{N}$, it follows that there exists a subsequence $m_k$, $k \in \mathbb{N}$,
such that $\lim_{k \to \infty} y_{m_k} = y$, $\|y\| = 1$. Since $y_m \in I\big( \frac{1}{\|x_m\|} u \big)$ it follows that
$y \in I(0)$. So $y \in K \cap -K = \{0\}$, which is impossible. Hence $I(u)$ is compact.

Choose $u \in \operatorname{ri} K$. We claim that u is an isolated extreme point of $I(u)$.
Since $u \in \operatorname{ri} K$ it follows that there exists $r > 0$ so that $u + x \in K$ for each
$\|x\| \le r$. Suppose that there exist $v, w \in I(u)$ such that $tv + (1 - t)w = u$
for some $t \in (0, 1)$. So $v = u - v_1$, $w = u - w_1$ for some $v_1, w_1 \ge^{K} 0$. The
equality $u = tv + (1 - t)w$ yields $0 = t v_1 + (1 - t) w_1 \ge^{K} t v_1 \ge^{K} 0$.
Hence $v_1 = 0$. (See Problem 2.) Similarly, $w_1 = 0$. Hence u is an extreme
point.

We now show that for any $x \in V$ such that $x \gneq^{K} 0$, $\|x\| \le r$, the point
$u - x$ is not an extreme point of $I(u)$. Indeed, $u - \frac{3}{2}x, \; u - \frac{1}{2}x \in I(u)$ and
$u - x = \frac{1}{2}\big( u - \frac{3}{2}x \big) + \frac{1}{2}\big( u - \frac{1}{2}x \big)$. Since u is an isolated extreme point,
Corollary 7.1.10 yields that u is exposed. Hence there exists $f \in V^*$ such
that $f(u) > f(u - x)$ for any $x \gneq^{K} 0$ satisfying $\|x\| \le r$. So $f(x) > 0$ for
any $x \gneq^{K} 0$ satisfying $\|x\| \le r$. Hence $f(y) = \frac{\|y\|}{r} f\big( \frac{r}{\|y\|} y \big) > 0$ for any
$y \gneq^{K} 0$. This proves part 1 of the theorem.

Let $C = \{x \gneq^{K} 0, \; f(x) = 1\}$. Since K is closed, it follows that C
is a convex closed set. We claim that C is compact, i.e. bounded. Fix
a norm $\|\cdot\|$ on V. Assume to the contrary that there exists a sequence
$x_m \in C$ such that $\lim_{m \to \infty} \|x_m\| = \infty$. Let $y_m = \frac{1}{\|x_m\|} x_m$, $m \in \mathbb{N}$. Since
$\|y_m\| = 1$, $m \in \mathbb{N}$, it follows that there exists a subsequence $m_k$, $k \in \mathbb{N}$, such
that $\lim_{k \to \infty} y_{m_k} = y$, $\|y\| = 1$. Since K is closed it follows that $y \gneq^{K} 0$.
Note that

$$f(y) = \lim_{k \to \infty} f(y_{m_k}) = \lim_{k \to \infty} \frac{f(x_{m_k})}{\|x_{m_k}\|} = \lim_{k \to \infty} \frac{1}{\|x_{m_k}\|} = 0.$$

This contradicts the assumption that f is strictly positive on K. Thus C is
a convex compact set. We next observe that $\dim C = n - 1$. First observe
that $f(C - x) = 0$ for any $x \in C$. Hence $\dim C \le n - 1$. Observe next
that if $f(z) = 0$ and $\|z\| \le r$ then $u + z \in C$. Hence $\dim C = n - 1$. Let
$w \gneq^{K} 0$. Define $w_1 = \frac{1}{f(w)} w \in C$. The Carathéodory theorem claims that $w_1$
is a convex combination of at most n extreme points of C. This proves
part 2 of the theorem.

Let $R := \max\{\|x\|, \; x \in C\}$. Let $g \in V^*$, $\|g\|^* \le \frac{1}{R}$. Then for $x \in C$,
$|g(x)| \le 1$. Hence $(f + g)(x) \ge 1 - |g(x)| \ge 0$. Thus $f + g \in K^*$. So f is
an interior point of $K^*$. Clearly $K^*$ is a closed and pointed cone. Hence
part 3 of the theorem holds. □



Theorem 7.14.4 Let V be a vector space over $\mathbb{F} = \mathbb{R}, \mathbb{C}$. Assume that
$K \subset V$ is a proper cone. Assume that $T \in \operatorname{Hom}(V, V)_K$. Let $S(T) \subset \mathbb{C}$
be the eigenvalue multiset of T (i.e. the root set of the polynomial
$\det(zI - T)$). Then

1. $\rho(T) \in S(T)$.

2. Let $\lambda \in S(T)(\rho(T))$. Then $\operatorname{index}(\lambda, T) \le k := \operatorname{index}(\rho(T), T)$.

3. There exists $x \gneq^{K} 0$ such that $Tx = \rho(T)x$ and $x \in (\rho(T)I - T)^{k-1} V$.

4. If $T >^{K} 0$ then $\rho(T) > 0$, $k = 1$, $S(T)(\rho(T)) = \{\rho(T)\}$ and $\rho(T)$ is
a simple root of the characteristic polynomial of T. (This statement
can hold only if $\mathbb{F} = \mathbb{R}$.)

Assume in addition that $\rho(T) = 1$. Let $P \in \operatorname{Hom}(V, V)$ be the spectral
projection, associated with T, on the generalized eigenspace corresponding
to 1. Then

$$\lim_{m \to \infty} \frac{k!}{m^k} \sum_{i=0}^{m-1} T^i = (T - I)^{k-1} P \gneq^{K} 0. \tag{7.14.1}$$

Assume finally that $\mathbb{F} = \mathbb{R}$ and that $\lambda \ne 1$, $|\lambda| = 1$, is an eigenvalue of T of index
k. Let $P(\lambda) \in \operatorname{Hom}(V, V)$ be the spectral projection, associated with T, on
the generalized eigenspace corresponding to $\lambda$. View $P(\lambda) = P_1 + \sqrt{-1} P_2$,
where $P_1, P_2 \in \operatorname{Hom}(V, V)$. Then

$$\big| f\big( (T - \lambda I)^{k-1} P(\lambda) y \big) \big| \le f\big( (T - I)^{k-1} P y \big) \quad \text{for any } y \in K, \; f \in K^*. \tag{7.14.2}$$

Proof. Suppose first that $\rho(T) = 0$, i.e. T is nilpotent. Then parts 1-2
are trivial. Choose $y >^{K} 0$. Then there exists an integer $j \in [0, k - 1]$ so
that $T^j y \gneq^{K} 0$ and $T^{j+1} y = 0$. Then $x := T^j y$ is an eigenvector of T
which lies in K. Since $Tx = 0$ it follows that T cannot be positive.

From now on we assume that $\rho(T) > 0$, and without loss of generality
we assume that $\rho(T) = 1$. In particular, $\dim V \ge 1$. Choose a basis
$b_1, \ldots, b_n$ in V. Assume first that $\mathbb{F} = \mathbb{R}$. Then T is represented in the basis
$b_1, \ldots, b_n$ by $A = [a_{ij}] \in \mathbb{R}^{n \times n}$. Consider the matrix
$B(z) = (I - zA)^{-1} = [b_{ij}(z)]_{i,j=1}^{n} \in \mathbb{C}(z)^{n \times n}$. Using the Jordan canonical form of A we deduce that all
the singular points of all $b_{ij}(z)$ are of the form $\mu := \frac{1}{\lambda}$ where $\lambda$ is a nonzero
eigenvalue of A. Furthermore, if $0 \ne \lambda \in \operatorname{spec}(A)$ and $\lambda$ has index $l = l(\lambda)$,
then for each i, j, $b_{ij}(z)$ may have a pole at $\mu$ of order l at most, and there
is at least one entry $b_{ij}(z)$, where $i = i(\lambda)$, $j = j(\lambda)$, such that $b_{ij}(z)$ has
a pole of order l exactly. In particular, for each $x, y \in \mathbb{R}^n$ the rational
function $y^{\top} B(z) x$ may have a pole at $\mu$ of order l at most. Furthermore,
there exist $x, y \in \mathbb{R}^n$, $x = x(\lambda)$, $y = y(\lambda)$, such that $y^{\top} B(z) x$ has a pole
at $\mu$ of order l. (See Problem 7.)

Let $\hat{K} \subset \mathbb{R}^n$ denote the cone induced by $K \subset V$. Then $\hat{K}$ is a proper
cone. Denote by

$$\hat{K}^* := \{y \in \mathbb{R}^n, \; y^{\top} x \ge 0 \text{ for all } x \in \hat{K}\}.$$

Theorem 7.14.3 implies that $\hat{K}^*$ is a proper cone. Observe next that $A\hat{K} \subset \hat{K}$.
Clearly, we have the following MacLaurin expansions:

$$B(z) = (I - zA)^{-1} = \sum_{i=0}^{\infty} z^i A^i, \quad \text{for } |z| < \frac{1}{\rho(A)}, \tag{7.14.3}$$

$$y^{\top} B(z) x = \sum_{i=0}^{\infty} (y^{\top} A^i x) z^i, \quad \text{for } |z| < \frac{1}{\rho(A)}. \tag{7.14.4}$$

Note that $y^{\top} B(z) x$ is a rational function. Denote by $r(x, y) \in (0, \infty]$ the
convergence radius of $y^{\top} B(z) x$. So $r(x, y) = \infty$ if and only if $y^{\top} B(z) x$
is a polynomial. Assume first that $x \in \hat{K}$, $y \in \hat{K}^*$. Then the MacLaurin
coefficients of $y^{\top} B(z) x$ are nonnegative. Hence we can apply the Vivanti-Pringsheim
theorem 7.12.2, i.e. $r(x, y)$ is a singular point of $y^{\top} B(z) x$.
Hence $\frac{1}{r(x, y)} \in \operatorname{spec}(A)$ if $r(x, y) < \infty$. Suppose now that $x, y \in \mathbb{R}^n$. Since
$\hat{K}$ and $\hat{K}^*$ are generating it follows that $x = x_+ - x_-$, $y = y_+ - y_-$ for some
$x_+, x_- \in \hat{K}$, $y_+, y_- \in \hat{K}^*$. So

$$y^{\top} B(z) x = y_+^{\top} B(z) x_+ + y_-^{\top} B(z) x_- - y_-^{\top} B(z) x_+ - y_+^{\top} B(z) x_-.$$

Hence

$$r(x, y) \ge r(x_+, x_-, y_+, y_-) := \min\big( r(x_+, y_+), r(x_-, y_-), r(x_+, y_-), r(x_-, y_+) \big). \tag{7.14.5}$$

Let $\lambda \in \operatorname{spec}(A)$, $|\lambda| = \rho(A)$, and assume that $l = \operatorname{index}(\lambda)$. Choose
x, y such that $\mu = \frac{1}{\lambda}$ is a pole of $y^{\top} B(z) x$ of order l. Hence we must
have equality in (7.14.5). More precisely, there exist $x_1 \in \{x_+, x_-\}$,
$y_1 \in \{y_+, y_-\}$ such that $r(x, y) = r(x_1, y_1)$ and $y_1^{\top} B(z) x_1$ has a pole at
$\mu$ of order l. The Vivanti-Pringsheim theorem yields that $r(x_1, y_1)$ is a pole of
order $k' \ge l$ of $y_1^{\top} B(z) x_1$. Hence $\rho(A) \in \operatorname{spec}(A)$ and $\operatorname{index}(\rho(A)) \ge k' \ge l = \operatorname{index}(\lambda)$.
This proves parts 1-2. Choose $\lambda = \rho(A)$ that satisfies
the above assumptions. Hence $B(z) x_1$ must have at least one coordinate
with a pole at $\rho(A)^{-1}$ of order $k = \operatorname{index}(\rho(A))$. Problem 7 yields that
$\lim_{t \nearrow 1} (1 - t)^k B(t \rho(A)^{-1}) x_1 = u \ne 0$ such that $Au = \rho(A) u$ and
$u \in (\rho(A) I - A)^{k-1} \mathbb{R}^n$. Use the fact that for $z = t \rho(A)^{-1}$, $t \in (0, 1)$, we have the equality
(7.14.3). So $(1 - t)^k B(t \rho(A)^{-1}) x_1 \in \hat{K}$ for each $t \in (0, 1)$. Since $\hat{K}$ is closed
we deduce that $u \in \hat{K}$. This proves part 3.

The equality (7.14.1) follows from the Tauberian theorem 8 and its
application to the series (7.14.3).

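Assume now that $A >^{\hat{K}} 0$. Observe first that the eigenvector $x \gneq^{\hat{K}} 0$,
$Ax = x$, satisfies $x >^{\hat{K}} 0$, i.e. $x \in \operatorname{ri} \hat{K}$. Next we claim that the dimension
of the eigenspace $\{y, \; (A - I)y = 0\}$ is 1. Assume to the contrary that
$Ay = y$ and x, y are linearly independent. Since $x \in \operatorname{ri} \hat{K}$ and $\hat{K}$ is closed
and pointed, there exists $s_0 \in \mathbb{R}$ such that $x(s_0) := x + s_0 y$ is a nonzero
point on the boundary of $\hat{K}$. Since $x(s_0) = A x(s_0) >^{\hat{K}} 0$ we obtain a
contradiction. Hence 1 is a geometrically simple eigenvalue.

Next we claim that $k = \operatorname{index}(1) = 1$. Assume to the contrary that $k > 1$.
Recall that $x \in (A - I)^{k-1} \mathbb{R}^n$. So $x = (A - I)y$. Hence $x = (A - I)(y + tx)$.
Choose $t > 0$ big enough so that $z = y + tx = t\big( \frac{1}{t} y + x \big) >^{\hat{K}} 0$. Since
$x >^{\hat{K}} 0$ it follows that there exists $r > 0$ such that $(A - I)z - rz = x - rz >^{\hat{K}} 0$.
That is, $Az \ge^{\hat{K}} (1 + r)z$. Hence $A^m z \ge^{\hat{K}} (1 + r)^m z$, i.e. $\big( \frac{1}{1+r} A \big)^m z \ge^{\hat{K}} z$.
Since $\rho\big( \frac{1}{1+r} A \big) = \frac{1}{1+r} < 1$ it follows that $0 = \lim_{m \to \infty} \big( \frac{1}{1+r} A \big)^m z \ge^{\hat{K}} z >^{\hat{K}} 0$,
which is impossible. Hence $\operatorname{index}(1) = 1$.

We now show that if $\lambda \in \operatorname{spec}(A)$ and $|\lambda| = 1$ then $\lambda = 1$. Let
$J := \{y \in \hat{K}, \; \|y\|_2 = 1\}$. Since $\hat{K}$ is closed it follows that J is a compact set. Since
$A >^{\hat{K}} 0$ it follows that $AJ \subset \operatorname{ri} \hat{K}$. Hence there exists $s \in (0, 1)$ such that
$Ay - sz >^{\hat{K}} 0$ for any $y \in J$, $z \in \mathbb{R}^n$, $\|z\|_2 = 1$. In particular, $Ay - sy >^{\hat{K}} 0$
for each $y \in J$. That is, $(A - sI)J \subset \operatorname{ri} \hat{K}$. Hence $(A - sI) >^{\hat{K}} 0$. Note
that $(A - sI)x = (1 - s)x >^{\hat{K}} 0$. So $\rho(A - sI) = 1 - s$. Each eigenvalue
of $A - sI$ is $\lambda - s$, where $\lambda \in \operatorname{spec}(A)$. Apply part 1 of the theorem to
deduce that $|\lambda - s| \le 1 - s$. Since for any $\zeta \in S^1 \setminus \{1\}$ we must have that
$|\zeta - s| > |\zeta| - s = 1 - s$, we obtain that $S(A)(1) = \{1\}$, which concludes
the proof of part 4 for $\mathbb{F} = \mathbb{R}$.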

Assume next that $\rho(T) = \rho(A) = 1$ and $k = \operatorname{index}(1)$. (7.14.11) of
Problem 9 yields the equality in (7.14.1). Since any sum in the left-hand
side of (7.14.1) is nonnegative with respect to the cone $\hat{K}$ it follows that
$(A - I)^{k-1} P \ge^{\hat{K}} 0$. Let $\lambda_1 = 1$, so $s_1 = k$; see the notation of Problem 7. Recall
that $(A - I)^{k-1} P$ in the basis $b_1, \ldots, b_n$ is represented by the component
$Z_{1(k-1)} \ne 0$. Hence $(A - I)^{k-1} P \gneq^{\hat{K}} 0$.

Assume finally that $\lambda \in \operatorname{spec}(T)$, $|\lambda| = 1$, $\lambda \ne 1$, $\operatorname{index}(\lambda) = k$. Let
$P(\lambda)$ be the spectral projection on the eigenvalue $\lambda$. Then (7.14.11) yields

$$\lim_{m \to \infty} \frac{k!}{m^k} \sum_{r=0}^{m-1} \bar{\lambda}^r T^r = \bar{\lambda}^{k-1} (T - \lambda I)^{k-1} P(\lambda). \tag{7.14.6}$$

Let $y \in K$, $f \in K^*$. Since $f(T^r y) \ge 0$ and $|\lambda| = 1$ we obtain $|f(\bar{\lambda}^r T^r y)| = f(T^r y)$.
The triangle inequality yields

$$\Big| f\Big( \sum_{r=0}^{m-1} \bar{\lambda}^r T^r y \Big) \Big| \le f\Big( \sum_{r=0}^{m-1} T^r y \Big).$$

Let $m \to \infty$ and use the equalities (7.14.6) and (7.14.1) to deduce (7.14.2).

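We now point out why our results hold for a vector space V over $\mathbb{C}$.
Let $T \in \operatorname{Hom}(V, V)$ and assume that $b_1, \ldots, b_n$ is a basis in V. Then $V_{\mathbb{R}}$
has a basis $b_1, \ldots, b_n, \sqrt{-1} b_1, \ldots, \sqrt{-1} b_n$. Clearly T induces an operator
$\hat{T} \in \operatorname{Hom}(V_{\mathbb{R}}, V_{\mathbb{R}})$. Let $A \in \mathbb{C}^{n \times n}$ represent T in the basis $b_1, \ldots, b_n$.
Observe that $A = B + \sqrt{-1} C$, where $B, C \in \mathbb{R}^{n \times n}$. Then $\hat{T}$ is represented by
the matrix

$$\hat{A} = \begin{bmatrix} B & -C \\ C & B \end{bmatrix}$$

in the basis $b_1, \ldots, b_n, \sqrt{-1} b_1, \ldots, \sqrt{-1} b_n$.
See Problem 10.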
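The passage from a complex matrix to its real $2n \times 2n$ representation can be checked directly. The sketch below (function names are illustrative) stacks real and imaginary coordinates in the order $b_1, \ldots, b_n, \sqrt{-1} b_1, \ldots, \sqrt{-1} b_n$ and compares both ways of applying the operator.

```python
def real_representation(A):
    """Return the 2n x 2n real matrix [[B, -C], [C, B]] for A = B + sqrt(-1) C."""
    n = len(A)
    top = [[A[i][j].real for j in range(n)] + [-A[i][j].imag for j in range(n)]
           for i in range(n)]
    bottom = [[A[i][j].imag for j in range(n)] + [A[i][j].real for j in range(n)]
              for i in range(n)]
    return top + bottom

def real_coordinates(v):
    """Coordinates of a complex vector in the real basis: real parts, then imaginary parts."""
    return [z.real for z in v] + [z.imag for z in v]

def apply(M, v):
    """Matrix-vector product for lists of rows."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

A = [[1 + 2j, 3j], [-1 + 0j, 2 - 1j]]
v = [1 - 1j, 2 + 0.5j]
lhs = apply(real_representation(A), real_coordinates(v))
rhs = real_coordinates(apply(A, v))
print(lhs, rhs)
```

The agreement reflects $(B + \sqrt{-1} C)(x + \sqrt{-1} y) = (Bx - Cy) + \sqrt{-1}(Cx + By)$.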



C B 



Problems 



1. Let V be a vector space of dimension n over $\mathbb{C}$, with a basis $z_1, \ldots, z_n$.
Show

(a) V is a vector space over $\mathbb{R}$ of dimension 2n, with a basis
$z_1, \sqrt{-1} z_1, \ldots, z_n, \sqrt{-1} z_n$. We denote this real vector space by
$V_{\mathbb{R}}$ and its dimension by $\dim_{\mathbb{R}} V$.

(b) Let the assumptions of 1a hold. Then $V^*$ can be identified with
$(V_{\mathbb{R}})^*$ as follows. Each $f \in V^*$ gives rise to $\hat{f} \in (V_{\mathbb{R}})^*$ by the
formula $\hat{f}(z) = \Re f(z)$. In particular, if $f_1, \ldots, f_n$ form a basis in
$V^*$ then $\hat{f}_1, \widehat{\sqrt{-1} f_1}, \ldots, \hat{f}_n, \widehat{\sqrt{-1} f_n}$ is a basis in $(V_{\mathbb{R}})^*$.

2. Let $K$ be a cone. Show that $K$ is pointed if and only if the two inequalities
$x \ge_K y$, $y \ge_K x$ imply that $x = y$.

3. Let the assumptions of Problem 1 hold. Assume that $K \subset V$ is a
cone. Denote by $K_{\mathbb{R}}$ the induced cone in $V_{\mathbb{R}}$. Show

(a) $K$ is closed if and only if $K_{\mathbb{R}}$ is closed.

(b) $K$ is pointed if and only if $K_{\mathbb{R}}$ is pointed.

(c) $K$ is generating if and only if $K_{\mathbb{R}}$ is generating.

(d) $K$ is proper if and only if $K_{\mathbb{R}}$ is proper.

4. Let $U$ be a real vector space. Define $U_{\mathbb{C}}$ as in Proposition 4.1.2.
Assume that $K \subset U$ is a cone. Let $K_{\mathbb{C}} := \{(x, y) : x, y \in K\}$. Show

(a) $K_{\mathbb{C}}$ is a cone in $U_{\mathbb{C}}$.

(b) $K$ is closed if and only if $K_{\mathbb{C}}$ is closed.

(c) $K$ is pointed if and only if $K_{\mathbb{C}}$ is pointed.

(d) $K$ is generating if and only if $K_{\mathbb{C}}$ is generating.

(e) $K$ is proper if and only if $K_{\mathbb{C}}$ is proper.

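5. Let the assumptions of Problem 4 hold. Assume that $A \in \operatorname{Hom}(U, U)_K$.
Define $\hat A : U_{\mathbb{C}} \to U_{\mathbb{C}}$ by $\hat A(x, y) = (Ax, Ay)$. Show

(a) $\hat A \in \operatorname{Hom}(U_{\mathbb{C}}, U_{\mathbb{C}})_{K_{\mathbb{C}}}$.

(b) $\det(zI - \hat A) = \det(zI - A)$.

(c) $\hat A$ is not positive with respect to $K_{\mathbb{C}}$.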

6. Let $V$ be a vector space over $F = \mathbb{R}, \mathbb{C}$. Assume that $K \subset V$ and
$A \in \operatorname{Hom}(V, V)_K$. Then $A^* \in \operatorname{Hom}(V^*, V^*)_{K^*}$.

7. Let $A \in \mathbb{C}^{n\times n}$. Assume that $S(A) = \{\lambda_1, \ldots, \lambda_n\}$ is the eigenvalue
multiset of $A$. Consider the matrix

$B(z) = (I - zA)^{-1} = [b_{ij}]_{i=j=1}^{n} \in \mathbb{C}(z)^{n\times n}$. Show

(a) Let $Z_{i0}, \ldots, Z_{i(s_i-1)}$, $i = 1, \ldots, \ell$ be all the matrix components
of $A$ as in §3.1. Then

i. $Z_{i0}$ is the spectral projection of $A$ on $\lambda_i$. (See §3.4.) Furthermore

(7.14.8) $(I - zA)^{-1} = \sum_{i=1}^{\ell} \sum_{j=0}^{s_i-1} \frac{z^j}{(1 - \lambda_i z)^{j+1}} Z_{ij}$,

(7.14.9) $\lim_{t \nearrow 1} (1 - t)^{s_i} \left(I - \frac{t}{\lambda_i} A\right)^{-1} = \frac{1}{\lambda_i^{s_i-1}} Z_{i(s_i-1)}$.

(Hint: To show the first equality use (3.4.1) by letting $\lambda = \frac{1}{z}$
and divide (3.4.1) by $z$.)

(b) All the singular points of all $b_{ij}(z)$ are of the form $\mu := \frac{1}{\lambda}$, where
$\lambda$ is a nonzero eigenvalue of $A$. Furthermore, if $0 \ne \lambda \in \operatorname{spec}(A)$
and $\lambda$ has index $l = l(\lambda)$, then for each $i, j$ the entry $b_{ij}(z)$ may have a
pole at $\mu$ of order at most $l$, and there is at least one entry $b_{ij}(z)$,
where $i = i(\lambda)$, $j = j(\lambda)$, such that $b_{ij}(z)$ has a pole of order $l$
exactly. Suppose furthermore that for $x \in \mathbb{C}^n$ at
least one of the entries of $B(z)x$ has a pole of order $l$ at $\mu$. Then
$\lim_{t \nearrow 1} (1 - t)^l B(t\mu)x = y \ne 0$, $Ay = \lambda y$ and $y \in (\lambda I - A)^{l-1}\mathbb{C}^n$.

(c) Let $e_i = (\delta_{i1}, \ldots, \delta_{in})^T$, $i \in \langle n \rangle$. For each $0 \ne \lambda \in \operatorname{spec}(A)$ of
index $l = l(\lambda)$ there exist $e_i, e_j$, $i = i(\lambda)$, $j = j(\lambda)$, such that
$e_i^T B(z) e_j$ has a pole at $\frac{1}{\lambda}$ of order $l$ exactly.

8. Let $k, l \in \mathbb{N}$ and consider the rational function

$\frac{1}{(1 - \mu z)^l}$ [...]

Show



(a) [...]

Hint: Use the Riemann sums for the integral $\int_0^1 x^{k-1}\,dx$ to show

$\lim_{m\to\infty} \frac{k}{m^k} \sum_{i=1}^{m} i^{k-1} = 1$.

(b) [...]

under the following assumptions:

i. $|\mu| = 1$, $\mu \ne 1$ and $l = k$. Hint: Recall the identity [...]
Show that [...]

ii. $|\mu| = 1$ and $l < k$. Hint: Sum the absolute values of the
corresponding terms and use part 8a.

iii. $|\mu| < 1$. Hint: Use the Cauchy-Hadamard formula to show
that [...]

9. Let the assumptions of Problem 7 hold.

(a) For $m > \max(s_1, \ldots, s_\ell)$

(7.14.10) $\sum_{r=0}^{m-1} z^r A^r = \sum_{i=1}^{\ell} \sum_{j=0}^{s_i-1} z^j \left( \sum_{r=0}^{m-1-j} (-1)^r \binom{-(j+1)}{r} \lambda_i^r z^r \right) Z_{ij}$.

(Use the first $m$ terms of the MacLaurin expansion of both sides of
(7.14.8).)

(b) Assume furthermore that $\rho(A) = 1$ and $k$ is the maximal index of
all eigenvalues $\lambda_i$ satisfying $|\lambda_i| = 1$. Assume that $\lambda = \lambda_1$, $|\lambda| =
1$, and $k = \operatorname{index}(\lambda_1) = s_1$. Let $P(\lambda) = Z_{10}$ be the spectral
projection on $\lambda$ and $Z_{1(s_1-1)} = (A - \lambda I)^{k-1}P(\lambda_1)$. Then (7.14.10)
and Problem 8 imply

(7.14.11) $\lim_{m\to\infty} \frac{k!}{m^k} \sum_{r=0}^{m-1} \bar\lambda^r A^r = \bar\lambda^{k-1}(A - \lambda I)^{k-1}P(\lambda)$.
10. Let $V$ be a vector space over $\mathbb{C}$ with a basis $b_1, \ldots, b_n$. Then $V_{\mathbb{R}}$
has a basis $b_1, \ldots, b_n, \sqrt{-1}\,b_1, \ldots, \sqrt{-1}\,b_n$. Let $T \in \operatorname{Hom}(V, V)$.
Show that $T$ induces an operator $\hat T \in \operatorname{Hom}(V_{\mathbb{R}}, V_{\mathbb{R}})$. Let
$A \in \mathbb{C}^{n\times n}$ represent $T$ in the basis $b_1, \ldots, b_n$. Observe that
$A = B + \sqrt{-1}\,C$, where $B, C \in \mathbb{R}^{n\times n}$. Then $\hat T$ is represented by the matrix

$\hat A = \begin{bmatrix} B & -C \\ C & B \end{bmatrix}$

in the basis $b_1, \ldots, b_n, \sqrt{-1}\,b_1, \ldots, \sqrt{-1}\,b_n$.
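
The Riemann-sum limit in the hint to Problem 8a can be checked with exact arithmetic. This is a small sketch, assuming the normalization $\frac{k}{m^k}\sum_{i=1}^{m} i^{k-1}$. The function name `scaled_power_sum` is illustrative and does not come from the text.

```python
from fractions import Fraction


def scaled_power_sum(k, m):
    """Return (k / m^k) * sum_{i=1}^{m} i^(k-1) as an exact fraction."""
    return Fraction(k, m ** k) * sum(i ** (k - 1) for i in range(1, m + 1))


# The values approach 1 as m grows, for each fixed k.
for k in (1, 2, 3, 5):
    values = [float(scaled_power_sum(k, m)) for m in (10, 100, 1000)]
    print(k, values)
```

For $k = 2$ the scaled sum equals $(m+1)/m$ exactly, which makes the $1/m$ rate of convergence visible.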